By applying the SOLID principles, the Single Responsibility Principle (SRP) in particular, our code base becomes a large set of small classes. Each of these classes is responsible for doing one little thing. By composing instances of these classes together, we get large and complex application behavior out of the small individual things that they do.
These small classes have to be highly composable to make it easy for us to compose them together to get the complex behavior that we want for the application. Classes become composable by having their dependencies injected into them instead of them creating these dependencies; and by having dependencies on abstract types or interfaces, instead of concrete classes. This behavior is what Dependency Injection (DI) is all about.
This article is published from the DNC Magazine for Developers and Architects.
For the purpose of demonstrating these ideas, I have created an example C# application on GitHub. I am going to discuss the changes I made to that application to modify its behavior, paying particular attention to the role of object composition in this process of behavior modification.
Unlike the other articles that I have published here at the DNC magazine, you will have to download and at least examine the application code if you want to benefit the most from this article.
Object Composition application
The main feature of the application, the Document Indexer application, is to read text documents from the file system, extract the list of distinct words for each document and store the documents along with their extracted words into the database. Another application can later use this information to search documents quickly.
Although the main feature of the application is simple, we are going to extend the application to add more features. More specifically, we are going to add features to detect already processed documents, record performance information, track the number of processed documents, move processed documents to another folder, retry database access in case of errors, record database access errors to the Event Log, and finally run the whole process continuously.
Please note that in this application, I am not going to pay attention to many software development practices, e.g. testing, project structure, and data access patterns. I am also not going to care much about the performance of the application. The primary focus of this example is to show object composition in action.
The URL for the GitHub repository where the example resides is https://github.com/ymassad/CompositionExample
The initial build
You can browse to the initial build using the following URL: https://github.com/ymassad/CompositionExample/tree/InitialApplication
In order to run the application, you need to do the following:
1) Make sure that you have SQL Server installed (any version will suffice)
2) Copy the Documents folder that you find in the repository to any place on your file system. This folder contains 100 sample documents.
3) Open the solution file from Visual Studio. I have used Visual Studio 2015 community edition to create the solution, but it will also work with Visual Studio 2013.
4) Open Settings.Xml and change the ConnectionString element value to match the settings of your SQL Server instance.
5) In Settings.Xml, also change the FolderPath element value to match the path where you stored the documents.
6) Build the solution and run the program.
Go ahead, download the application, and take a look at the code. Most importantly, take a look at the Composition Root in Program.cs.
Here is the object graph that is created in the Composition Root:
The DocumentGrabberAndProcessor object is the root object, and it implements the IRunnable interface so that it can be triggered to do its job. It invokes its IDocumentsSource dependency to obtain documents and passes them to its IDocumentProcessor dependency to process these documents.
Currently, we have a FileSystemDocumentsSource that gets documents from the file system. For IDocumentProcessor, we have an IndexProcessor class that extracts words from the document via its IWordsExtractor dependency and then uses its IDocumentWithExtractedWordsStore dependency to store documents along with their extracted words.
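To make this object graph more concrete, here is a minimal sketch of how these types could fit together. Only the type names come from the application. Every member signature (GetDocuments, Process, GetWords, Store), the shape of the Document class, and the in-memory stand-ins are assumptions made for illustration and may differ from the code in the repository.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Only the type names come from the article; every member signature is assumed.
public sealed class Document
{
    public Document(string name, string content) { Name = name; Content = content; }
    public string Name { get; }
    public string Content { get; }
}

public interface IRunnable { void Run(); }
public interface IDocumentsSource { IReadOnlyList<Document> GetDocuments(); }
public interface IDocumentProcessor { void Process(Document document); }
public interface IWordsExtractor { IReadOnlyList<string> GetWords(string content); }
public interface IDocumentWithExtractedWordsStore
{
    void Store(Document document, IReadOnlyList<string> words);
}

// Root object: pulls documents from the source and hands each one to the processor.
public sealed class DocumentGrabberAndProcessor : IRunnable
{
    private readonly IDocumentsSource source;
    private readonly IDocumentProcessor processor;

    public DocumentGrabberAndProcessor(IDocumentsSource source, IDocumentProcessor processor)
    {
        this.source = source;
        this.processor = processor;
    }

    public void Run()
    {
        foreach (var document in source.GetDocuments())
            processor.Process(document);
    }
}

// Extracts the distinct words of a document and stores them with the document.
public sealed class IndexProcessor : IDocumentProcessor
{
    private readonly IWordsExtractor extractor;
    private readonly IDocumentWithExtractedWordsStore store;

    public IndexProcessor(IWordsExtractor extractor, IDocumentWithExtractedWordsStore store)
    {
        this.extractor = extractor;
        this.store = store;
    }

    public void Process(Document document)
    {
        store.Store(document, extractor.GetWords(document.Content));
    }
}

// Hypothetical stand-ins so the sketch runs without a file system or SQL Server.
public sealed class InMemoryDocumentsSource : IDocumentsSource
{
    private readonly IReadOnlyList<Document> documents;
    public InMemoryDocumentsSource(params Document[] documents) { this.documents = documents; }
    public IReadOnlyList<Document> GetDocuments() { return documents; }
}

public sealed class WhitespaceWordsExtractor : IWordsExtractor
{
    public IReadOnlyList<string> GetWords(string content)
    {
        return content
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(word => word.ToLowerInvariant())
            .Distinct()
            .ToList();
    }
}

public sealed class InMemoryStore : IDocumentWithExtractedWordsStore
{
    public Dictionary<string, IReadOnlyList<string>> Saved { get; } =
        new Dictionary<string, IReadOnlyList<string>>();

    public void Store(Document document, IReadOnlyList<string> words)
    {
        Saved[document.Name] = words;
    }
}

public static class CompositionRoot
{
    // The only place where concrete classes are chosen and wired together.
    public static IRunnable Compose(IDocumentsSource source, IDocumentWithExtractedWordsStore store)
    {
        return new DocumentGrabberAndProcessor(
            source,
            new IndexProcessor(new WhitespaceWordsExtractor(), store));
    }
}
```

Calling CompositionRoot.Compose(source, store).Run() triggers the whole graph. Because each class sees only interfaces, any node can later be swapped or wrapped without touching the others.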
Having these interfaces is the way that we make the system modifiable. These interfaces are our extension points. In the next section, we will use such extension points to add new functionality to the system. Now let’s run the application.
After running the application, a new database will be created in SQL Server called DocumentIndexer (unless you changed its name in the connection string). You will find two tables; the Documents table and the IndexEntries table.
Each row in the Documents table will contain the document filename and the document content. For each distinct word in each document, you will find a row in the IndexEntries table that specifies the word, and that refers to the corresponding row in the Documents table.
A feature request: We need the program to be aware of the documents it processed already
As it currently works, the program always takes all the documents, processes them, and stores the data in the database, even if the documents have already been processed.
We need to change the behavior of the system to check if documents are already processed before we process them. What is the seam that allows us to inject such new behavior?
I see two options: we could create a decorator for IDocumentsSource that filters the processed documents out, or we could create an IDocumentProcessor that skips a document if it is already processed. Let’s go with option 1.
We will create a decorator for IDocumentsSource that communicates with the database, and filters out the already processed documents before returning the results to the consumer. We will call it ProcessedDocumentsAwareDocumentsSource. Here is how the relevant part of the object graph looks now:
Go ahead and see how the code looks now. The URL for this commit is: https://github.com/ymassad/CompositionExample/tree/ProcessedDocumentsAware
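As a rough illustration of the decorator idea, the following sketch wraps an IDocumentsSource and filters its results. The GetDocuments signature, the Document class, and the IProcessedDocumentsLookup interface (which stands in for the database query) are hypothetical names introduced for this sketch. The class in the repository communicates with the database directly and may look quite different.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of a document and of the source interface.
public sealed class Document
{
    public Document(string name, string content) { Name = name; Content = content; }
    public string Name { get; }
    public string Content { get; }
}

public interface IDocumentsSource { IReadOnlyList<Document> GetDocuments(); }

// Hypothetical abstraction over the "is this document already in the database?" query.
public interface IProcessedDocumentsLookup { bool IsProcessed(string documentName); }

// Decorator: has the same interface as the object it wraps and adds filtering on top.
public sealed class ProcessedDocumentsAwareDocumentsSource : IDocumentsSource
{
    private readonly IDocumentsSource decorated;
    private readonly IProcessedDocumentsLookup lookup;

    public ProcessedDocumentsAwareDocumentsSource(
        IDocumentsSource decorated, IProcessedDocumentsLookup lookup)
    {
        this.decorated = decorated;
        this.lookup = lookup;
    }

    public IReadOnlyList<Document> GetDocuments()
    {
        // The decorated source still loads every document (content included)
        // before anything is filtered out.
        return decorated.GetDocuments()
            .Where(document => !lookup.IsProcessed(document.Name))
            .ToList();
    }
}

// Hypothetical in-memory stand-ins used only to exercise the decorator.
public sealed class FixedDocumentsSource : IDocumentsSource
{
    private readonly IReadOnlyList<Document> documents;
    public FixedDocumentsSource(params Document[] documents) { this.documents = documents; }
    public IReadOnlyList<Document> GetDocuments() { return documents; }
}

public sealed class InMemoryProcessedDocumentsLookup : IProcessedDocumentsLookup
{
    private readonly HashSet<string> processed;
    public InMemoryProcessedDocumentsLookup(params string[] names)
    {
        processed = new HashSet<string>(names);
    }
    public bool IsProcessed(string documentName) { return processed.Contains(documentName); }
}
```

The consumer of IDocumentsSource does not change at all. Only the Composition Root needs to wrap the existing source in the new decorator.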
A note about performance
You might have noticed that the solution that we used above performs poorly: the decorated FileSystemDocumentsSource still reads every file, including its content, before the decorator filters out the documents that are already processed. One solution is to refactor the FileSystemDocumentsSource to split the responsibility of discovering the files (getting file names) from the responsibility of reading the files (getting content). The responsibility of discovering the files would be moved to another interface that we can decorate later to filter files that are already processed.
In this article, I am focusing on showing the power of object composition, so I don’t want to do a lot of refactoring. Later in this article, we will add a feature to move processed documents to another folder which will fix this problem.
A feature request: We need to monitor the time it takes to save a document to the database
In this feature, we need to monitor calls to store documents in the database, measure the time, and record it somewhere. For now, we will simply record it to a text file.
We will create a decorator for IDocumentWithExtractedWordsStore that measures the time it takes to store the document (by invoking the decorated object) and then invokes an IPerformanceRecorder dependency to record the time. For now, we are going to create an implementation of IPerformanceRecorder that records the time to a new line in a text file. Here is how the relevant part of the object graph looks now:
Go ahead and view the code for the new commit at https://github.com/ymassad/CompositionExample/tree/RecordPerformanceToFile
Please note that I have added the location of the text file as a setting in Settings.Xml. You might want to change the value of this setting before running the application.
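The timing decorator can be sketched along the following lines. The Store and Record signatures, the Document class, the constructor of SimpleFileBasedPerformanceRecorder, and the NoOpStore stand-in are assumptions made for illustration, not necessarily what the repository contains.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.IO;

// Assumed shapes; only the type names come from the article.
public sealed class Document
{
    public Document(string name, string content) { Name = name; Content = content; }
    public string Name { get; }
    public string Content { get; }
}

public interface IDocumentWithExtractedWordsStore
{
    void Store(Document document, IReadOnlyList<string> words);
}

public interface IPerformanceRecorder { void Record(TimeSpan elapsed); }

// Decorator: measures how long the decorated store takes and reports the duration.
public sealed class PerformanceAwareDocumentWithExtractedWordsStore : IDocumentWithExtractedWordsStore
{
    private readonly IDocumentWithExtractedWordsStore decorated;
    private readonly IPerformanceRecorder recorder;

    public PerformanceAwareDocumentWithExtractedWordsStore(
        IDocumentWithExtractedWordsStore decorated, IPerformanceRecorder recorder)
    {
        this.decorated = decorated;
        this.recorder = recorder;
    }

    public void Store(Document document, IReadOnlyList<string> words)
    {
        var stopwatch = Stopwatch.StartNew();
        decorated.Store(document, words);
        stopwatch.Stop();
        // Reached only when the decorated store did not throw.
        recorder.Record(stopwatch.Elapsed);
    }
}

// Appends one line per measurement (milliseconds) to a text file.
public sealed class SimpleFileBasedPerformanceRecorder : IPerformanceRecorder
{
    private readonly string filePath;

    public SimpleFileBasedPerformanceRecorder(string filePath) { this.filePath = filePath; }

    public void Record(TimeSpan elapsed)
    {
        var line = elapsed.TotalMilliseconds.ToString("F3", CultureInfo.InvariantCulture);
        File.AppendAllText(filePath, line + Environment.NewLine);
    }
}

// Hypothetical stand-in for the real database-backed store.
public sealed class NoOpStore : IDocumentWithExtractedWordsStore
{
    public void Store(Document document, IReadOnlyList<string> words) { }
}
```

Because the decorator depends on IPerformanceRecorder rather than on the file-based class, the destination of the measurements can change without touching the timing logic.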
Update: Record the time to a Performance Counter instead of a text file
The product owner tells us that it is not acceptable to record the time to a text file because it is hard to manage. So we decided to use Performance Counters to record the time. More specifically, we decided to record the average time it takes to save a document to the database.
Performance Counters in the Windows operating system are objects that we can use to record performance data. Although you can understand the changes to the code without understanding Performance Counters in detail, you might want to read a bit about performance counters and how to use them from .NET before you continue. The following is a link to an article that can help you with that: https://support.microsoft.com/en-us/kb/316365
We currently have an IPerformanceRecorder interface. All we need to do is create a new implementation of this interface that records to some Performance Counter, and then inject an instance of the new class into the appropriate place. Here is how the relevant part of the object graph looks after the change:
We simply created the AveragePerformanceCounterBasedTimeRecorder class and injected an instance in the place where we used to have a SimpleFileBasedPerformanceRecorder instance.
Go ahead and take a look at the code at: https://github.com/ymassad/CompositionExample/tree/RecordPerformanceToCounter
Please note that I have created a method called EnsurePerformanceCountersAreCreated that I call at application startup. This method creates the Performance Counters at the operating system level. Usually, such code should exist in installation wizards or deployment scripts. The only reason I put it here is to make it easy for you to run the example.
Please also note that you have to be an administrator or part of the Performance Monitor Users group to be able to access (create and use) Performance Counters.
A feature request: We need to record to a Performance Counter the number of documents that have been processed
We now need to keep track of the number of documents we process. We decide to track it via a Performance Counter. Where should we inject this new behavior?
It seems that we need to change the object graph somewhere around the IDocumentProcessor interface. We can use the composite pattern to create a document processor that invokes two document processors (or more). We can use it to invoke the current document processor, i.e., the IndexProcessor and a new processor that just increments a Performance Counter. Here is how the relevant part of the object graph looks now:
Go ahead and see how the code looks now: https://github.com/ymassad/CompositionExample/tree/TrackNumberOfDocuments
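The composite pattern described above can be sketched as follows. CompositeDocumentProcessor is the name used in this article. The Process signature, the ICounter interface, and the DocumentCountingProcessor and RecordingProcessor classes are hypothetical names chosen for this sketch. In the application, the counting processor increments a Performance Counter instead of an in-memory value.

```csharp
using System.Collections.Generic;

// Assumed shapes; only IDocumentProcessor and CompositeDocumentProcessor are named in the article.
public sealed class Document
{
    public Document(string name, string content) { Name = name; Content = content; }
    public string Name { get; }
    public string Content { get; }
}

public interface IDocumentProcessor { void Process(Document document); }

// Composite: looks like a single processor, forwards each document to all of its children in order.
public sealed class CompositeDocumentProcessor : IDocumentProcessor
{
    private readonly IReadOnlyList<IDocumentProcessor> processors;

    public CompositeDocumentProcessor(params IDocumentProcessor[] processors)
    {
        this.processors = processors;
    }

    public void Process(Document document)
    {
        foreach (var processor in processors)
            processor.Process(document);
    }
}

// Hypothetical abstraction over "something that can be incremented".
public interface ICounter { void Increment(); }

public sealed class InMemoryCounter : ICounter
{
    public int Value { get; private set; }
    public void Increment() { Value++; }
}

// Hypothetical name for the processor that only counts documents.
public sealed class DocumentCountingProcessor : IDocumentProcessor
{
    private readonly ICounter counter;
    public DocumentCountingProcessor(ICounter counter) { this.counter = counter; }
    public void Process(Document document) { counter.Increment(); }
}

// Hypothetical stand-in that plays the role of the IndexProcessor in this sketch.
public sealed class RecordingProcessor : IDocumentProcessor
{
    public List<string> Seen { get; } = new List<string>();
    public void Process(Document document) { Seen.Add(document.Name); }
}
```

Since the composite accepts any number of processors, adding another step later only requires passing one more argument in the Composition Root.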
A feature request: We need to move processed documents to a new folder
After a document is done processing, we need to move it to a new folder. To do so, we are simply going to create a new implementation of IDocumentProcessor (namely the DocumentMover) that moves the document to some folder. We then inject it as a third document processor in the CompositeDocumentProcessor instance. Here is how the relevant part of the object graph looks after the change:
Now, since we are moving the processed files into a different folder, we no longer need to check if the files in the source folder are already processed. Therefore, we remove the ProcessedDocumentsAwareDocumentsSource instance from the object graph. Here is how the relevant part of the graph looks:
Here is the link for this commit:
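A DocumentMover along the lines described in this section might look like the sketch below. It assumes that a document knows the full path of the file it was read from (the FullPath property is hypothetical) and that the destination folder is passed through the constructor. Neither detail is confirmed by the article.

```csharp
using System.IO;

// Assumed shape: the FullPath property is hypothetical.
public sealed class Document
{
    public Document(string fullPath, string content) { FullPath = fullPath; Content = content; }
    public string FullPath { get; }
    public string Content { get; }
}

public interface IDocumentProcessor { void Process(Document document); }

// Processor that moves the file behind a document into another folder.
public sealed class DocumentMover : IDocumentProcessor
{
    private readonly string destinationFolder;

    public DocumentMover(string destinationFolder)
    {
        this.destinationFolder = destinationFolder;
    }

    public void Process(Document document)
    {
        // Does nothing if the folder already exists.
        Directory.CreateDirectory(destinationFolder);

        var target = Path.Combine(destinationFolder, Path.GetFileName(document.FullPath));

        // Throws an IOException if a file with the same name is already in the destination.
        File.Move(document.FullPath, target);
    }
}
```

Placed last in the CompositeDocumentProcessor, such a mover runs only after the earlier processors have finished with the document.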
A feature request: We need to retry storing documents to the database in case of a database error
Sometimes when we try to store a document, the operation fails and we get an exception. Sometimes the reason for the failure is transient. What we would like to do is to retry storing the document in case of an error. We could, for example, retry the operation five times before giving up.
We are going to create a decorator for IDocumentWithExtractedWordsStore that retries the invocation of the method on the decorated store. We will call this decorator RetryAwareDocumentWithExtractedWordsStore for now. We will make it possible to inject different retry strategies by making this class depend on an IRetryWaiter interface. The current implementation of IRetryWaiter, the IncrementalTimeRetryWaiter, increments the wait time by a constant amount in every retry.
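A minimal sketch of this retry decorator is shown below. The type names come from the article. The Store and WaitBeforeRetry signatures, the maxAttempts parameter (which counts all attempts, including the first one), and the FlakyStore stand-in are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Assumed shapes; only the type names come from the article.
public sealed class Document
{
    public Document(string name, string content) { Name = name; Content = content; }
    public string Name { get; }
    public string Content { get; }
}

public interface IDocumentWithExtractedWordsStore
{
    void Store(Document document, IReadOnlyList<string> words);
}

// Strategy that decides how long to wait after a given number of failed attempts.
public interface IRetryWaiter { void WaitBeforeRetry(int failedAttempts); }

// Waits one increment after the first failure, two after the second, and so on.
public sealed class IncrementalTimeRetryWaiter : IRetryWaiter
{
    private readonly TimeSpan increment;
    public IncrementalTimeRetryWaiter(TimeSpan increment) { this.increment = increment; }

    public void WaitBeforeRetry(int failedAttempts)
    {
        Thread.Sleep(TimeSpan.FromTicks(increment.Ticks * failedAttempts));
    }
}

public sealed class RetryAwareDocumentWithExtractedWordsStore : IDocumentWithExtractedWordsStore
{
    private readonly IDocumentWithExtractedWordsStore decorated;
    private readonly IRetryWaiter waiter;
    private readonly int maxAttempts;

    public RetryAwareDocumentWithExtractedWordsStore(
        IDocumentWithExtractedWordsStore decorated, IRetryWaiter waiter, int maxAttempts)
    {
        this.decorated = decorated;
        this.waiter = waiter;
        this.maxAttempts = maxAttempts;
    }

    public void Store(Document document, IReadOnlyList<string> words)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                decorated.Store(document, words);
                return;
            }
            // On the last allowed attempt the filter is false, so the exception propagates.
            catch (Exception) when (attempt < maxAttempts)
            {
                waiter.WaitBeforeRetry(attempt);
            }
        }
    }
}

// Hypothetical stand-in that fails a fixed number of times before succeeding.
public sealed class FlakyStore : IDocumentWithExtractedWordsStore
{
    private int remainingFailures;
    public FlakyStore(int failures) { remainingFailures = failures; }
    public int Calls { get; private set; }

    public void Store(Document document, IReadOnlyList<string> words)
    {
        Calls++;
        if (remainingFailures-- > 0)
            throw new InvalidOperationException("Simulated transient database error.");
    }
}
```

Keeping the waiting policy behind IRetryWaiter means that a different strategy, for example a fixed delay, can be injected without changing the retry loop.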
Here is how the relevant part of the object graph looks now:
See how we injected an instance of the new class on top of the PerformanceAwareDocumentWithExtractedWordsStore object (as a decorator for it). We could have done it the other way around. Consider the following potential change in the relevant part of the object graph:
What would be the difference in this case? Think about it for a minute.
The main difference is that in the first case, the recorded time (to the Performance Counter) would be for a single successful attempt to store the document into the database (which is what we want). In the second case, the recorded time is the total time consumed to store the document successfully into the database including the time for possible unsuccessful attempts.
This is an example of how merely composing the objects in a different way causes the system to behave differently.
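The difference can be reproduced with a small experiment. The sketch below uses trimmed-down, hypothetical stand-ins (TimingStore and RetryingStore play the roles of the performance-aware and retry-aware decorators) and a store that takes about 50 ms per attempt and fails on the first one. None of these classes are taken from the repository.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

public sealed class Document
{
    public Document(string name, string content) { Name = name; Content = content; }
    public string Name { get; }
    public string Content { get; }
}

public interface IDocumentWithExtractedWordsStore
{
    void Store(Document document, IReadOnlyList<string> words);
}

// Every attempt costs about 50 ms; the first attempt fails, later ones succeed.
public sealed class SlowFlakyStore : IDocumentWithExtractedWordsStore
{
    private bool failedOnce;

    public void Store(Document document, IReadOnlyList<string> words)
    {
        Thread.Sleep(50);
        if (!failedOnce)
        {
            failedOnce = true;
            throw new InvalidOperationException("Simulated transient error.");
        }
    }
}

// Plays the role of the performance-aware decorator; records only successful calls.
public sealed class TimingStore : IDocumentWithExtractedWordsStore
{
    private readonly IDocumentWithExtractedWordsStore decorated;
    public TimingStore(IDocumentWithExtractedWordsStore decorated) { this.decorated = decorated; }
    public TimeSpan LastRecorded { get; private set; }

    public void Store(Document document, IReadOnlyList<string> words)
    {
        var stopwatch = Stopwatch.StartNew();
        decorated.Store(document, words);
        LastRecorded = stopwatch.Elapsed;
    }
}

// Plays the role of the retry-aware decorator; retries immediately, without waiting.
public sealed class RetryingStore : IDocumentWithExtractedWordsStore
{
    private readonly IDocumentWithExtractedWordsStore decorated;
    private readonly int maxAttempts;

    public RetryingStore(IDocumentWithExtractedWordsStore decorated, int maxAttempts)
    {
        this.decorated = decorated;
        this.maxAttempts = maxAttempts;
    }

    public void Store(Document document, IReadOnlyList<string> words)
    {
        for (var attempt = 1; ; attempt++)
        {
            try { decorated.Store(document, words); return; }
            catch (Exception) when (attempt < maxAttempts) { }
        }
    }
}

public static class CompositionOrderDemo
{
    // First case: retry wraps timing, so only the successful attempt is measured.
    public static TimeSpan RetryAroundTiming()
    {
        var timing = new TimingStore(new SlowFlakyStore());
        new RetryingStore(timing, 5).Store(new Document("a.txt", ""), new string[0]);
        return timing.LastRecorded;
    }

    // Second case: timing wraps retry, so the failed attempt is measured as well.
    public static TimeSpan TimingAroundRetry()
    {
        var timing = new TimingStore(new RetryingStore(new SlowFlakyStore(), 5));
        timing.Store(new Document("a.txt", ""), new string[0]);
        return timing.LastRecorded;
    }
}
```

With these stand-ins, RetryAroundTiming should report roughly the duration of one attempt, while TimingAroundRetry should report roughly the duration of two. The classes are identical in both cases; only the order of the constructor calls differs.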
The link for the new commit is: https://github.com/ymassad/CompositionExample/tree/RetryStoreDocuments
A feature request: We need to log to the event log any errors that occur during storing the document in the database
Retrying the operation is not enough; we also need to log the error to the event log.
Easy! We need another decorator for IDocumentWithExtractedWordsStore.
We create the ErrorAwareDocumentWithExtractedWordsStore class and inject it into the appropriate place. Here is how the object graph looks now:
Notice that the ErrorAwareDocumentWithExtractedWordsStore class has a dependency on IErrorReporter. This allows us to vary how we report the error later. For now, we provide an implementation of IErrorReporter that writes to an Event Log.
Here is the link for the commit: https://github.com/ymassad/CompositionExample/tree/RecordErrors
Please note that the code will create an event log source named DocumentIndexer. You need to have administrative privileges for this to work.
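The error-reporting decorator could be sketched as follows. The Store and Report signatures, the decision to rethrow after reporting, and the in-memory reporter are assumptions made for illustration. The implementation in the repository writes to the Event Log and may handle the exception differently.

```csharp
using System;
using System.Collections.Generic;

// Assumed shapes; only the type names come from the article.
public sealed class Document
{
    public Document(string name, string content) { Name = name; Content = content; }
    public string Name { get; }
    public string Content { get; }
}

public interface IDocumentWithExtractedWordsStore
{
    void Store(Document document, IReadOnlyList<string> words);
}

public interface IErrorReporter { void Report(Exception error); }

// Decorator: reports any failure of the decorated store, then lets it propagate.
public sealed class ErrorAwareDocumentWithExtractedWordsStore : IDocumentWithExtractedWordsStore
{
    private readonly IDocumentWithExtractedWordsStore decorated;
    private readonly IErrorReporter reporter;

    public ErrorAwareDocumentWithExtractedWordsStore(
        IDocumentWithExtractedWordsStore decorated, IErrorReporter reporter)
    {
        this.decorated = decorated;
        this.reporter = reporter;
    }

    public void Store(Document document, IReadOnlyList<string> words)
    {
        try
        {
            decorated.Store(document, words);
        }
        catch (Exception error)
        {
            reporter.Report(error);
            // Rethrow (an assumption) so that outer decorators, such as a retrying one,
            // still see the failure.
            throw;
        }
    }
}

// Hypothetical in-memory reporter. A reporter for the Windows Event Log could
// call System.Diagnostics.EventLog.WriteEntry instead.
public sealed class InMemoryErrorReporter : IErrorReporter
{
    public List<Exception> Errors { get; } = new List<Exception>();
    public void Report(Exception error) { Errors.Add(error); }
}

// Hypothetical stand-in for a database store that is currently unavailable.
public sealed class AlwaysFailingStore : IDocumentWithExtractedWordsStore
{
    public void Store(Document document, IReadOnlyList<string> words)
    {
        throw new InvalidOperationException("Simulated database error.");
    }
}
```

Where this decorator sits relative to the retry decorator determines whether every failed attempt is reported or only the final failure, which is another instance of composition order changing behavior.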
A feature request: We need the program to keep pulling new documents as long as it is running
Currently, the program will pull documents once from the folder, process them, and then exit. What we would like to do is make it keep pulling new documents (that users might drop into the input folder) until the user chooses to quit the application.
We create a new interface named ICancellableRunnable that allows an operation to run but also supports cancelling that operation by taking a CancellationToken. We then create a ContinuousRunnable class that keeps invoking an IRunnable dependency as long as it is not asked to cancel.
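Here is one way these two types could be written. The Run signatures, the pause between iterations, and the DelegateRunnable helper are assumptions made for illustration and are not confirmed by the article.

```csharp
using System;
using System.Threading;

public interface IRunnable { void Run(); }

// Assumed signature: the operation receives the token that signals cancellation.
public interface ICancellableRunnable { void Run(CancellationToken cancellationToken); }

// Keeps invoking the wrapped IRunnable until cancellation is requested.
public sealed class ContinuousRunnable : ICancellableRunnable
{
    private readonly IRunnable runnable;
    private readonly TimeSpan pauseBetweenRuns; // hypothetical setting

    public ContinuousRunnable(IRunnable runnable, TimeSpan pauseBetweenRuns)
    {
        this.runnable = runnable;
        this.pauseBetweenRuns = pauseBetweenRuns;
    }

    public void Run(CancellationToken cancellationToken)
    {
        while (!cancellationToken.IsCancellationRequested)
        {
            runnable.Run();

            // Pause between iterations, but wake up immediately on cancellation.
            cancellationToken.WaitHandle.WaitOne(pauseBetweenRuns);
        }
    }
}

// Hypothetical helper that turns a delegate into an IRunnable, used to exercise the loop.
public sealed class DelegateRunnable : IRunnable
{
    private readonly Action action;
    public DelegateRunnable(Action action) { this.action = action; }
    public void Run() { action(); }
}
```

In a console application, the Composition Root could create a CancellationTokenSource, start the ContinuousRunnable with its token, and call Cancel when the user asks to quit. The existing DocumentGrabberAndProcessor stays unchanged and simply becomes the wrapped IRunnable.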
Here is how the relevant part of the object graph looks after we create these types:
Here is the link for this commit:
When we write SOLID code, we create a lot of small classes, each having a single responsibility. We create an application that has a complex behavior by composing these small classes in a specific way. Classes are highly composable because they depend on interfaces instead of concrete types.
This article provided an example of how an application can evolve and highlighted the role of object composition in the process. The code for this example is provided on GitHub so that readers can examine it in detail.
Download the code at https://github.com/ymassad/CompositionExample/