Designing Data Objects in C# and F#

Posted by: Yacoub Massad , on 3/25/2018, in Category Patterns & Practices
Abstract: Learn how to design thread-safe data objects in C# and how to “modify” immutable objects using the DataObjectHelper Visual Studio extension. Also learn how to use F# to create data objects concisely and use them in C# projects.

In this article, I will give some recommendations on how to design data objects in C#. I will talk about immutability and about making invalid states unrepresentable. I will also present DataObjectHelper, a Visual Studio extension I created to help create data objects. I will also demonstrate how we can use F# to create data objects and use them from C# projects.

Data Objects in C# - Introduction

In a previous article, Data and Encapsulation in complex C# applications, I talked about behavior-only encapsulation among other things.

When doing behavior-only encapsulation, we separate data and behavior into separate units. Data units contain only data that is accessible directly, and behavior units contain only behavior. Such behavior units are similar to functions and procedures in functional and procedural programming, respectively.

Figure 1: Behavior-only encapsulation

In this article, I am going to give some recommendations on how to design these data units. I also call them data objects because in a language like C#, we create data units using classes; and as far as the language is concerned, instances of classes are “objects”.

Make your data objects immutable

This means that once a data object is created, its contents will never change.

In C#, we can do this by making all properties read-only and providing a constructor to allow the initialization of these properties like this:

public class TextDocument
{
    public TextDocument(string identifier, string content)
    {
        Identifier = identifier;
        Content = content;
    }

    public string Identifier { get; }

    public string Content { get; }
}

Benefits of immutability

By making data objects immutable, we get many benefits. I will only mention a couple of them here.

1. Data objects become thread-safe. This is true because all a thread can do to a data object is read from it. Having multiple threads reading something cannot cause a concurrency issue.

2. Code becomes easier to understand. For example, consider that you have some code that calls a function named DoSomething that has the following signature:

public static ProcessingResult DoSomething(TextDocument document)

To understand the effect of this function, we need to answer the following questions:

Q1. How is the return value ProcessingResult generated? In particular:

a. How is it affected by the input arguments? (the document argument in this case)

b. How is it affected by global state of the application (or even state external to the application such as a database)?

Q2. What effect on global state (or external state) does this function have?

Q3. What changes does this function make to the document argument?

By making data objects immutable, we can eliminate question 3 and thus make this method easier to understand.

To understand why this is important, consider the example where the DoSomething method passes the document argument to some other method, say Method1. Then, Method1 passes it to Method2 which passes it to Method3, and so on. Any of these methods can modify the TextDocument object passed to it. Now, in order to answer question 3, we need to understand how all of these methods modify the document.

By the way, if we make this function pure, we can also remove Q1.b and Q2. This would make the function even easier to understand. To learn more about pure functions, see the Functional Programming for C# Developers article here at DotNetCurry.
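
To make Q3 concrete, here is a minimal sketch of what can go wrong with a mutable document. The MutableTextDocument class and the bodies of DoSomething, Method1, and Method2 are hypothetical and exist only for this illustration:

```csharp
using System;

// Hypothetical mutable counterpart of TextDocument, for illustration only.
public class MutableTextDocument
{
    public string Identifier { get; set; }

    public string Content { get; set; }
}

public static class MutationDemo
{
    // Nothing in this signature hints that the argument gets modified.
    public static int DoSomething(MutableTextDocument document)
    {
        Method1(document);
        return document.Content.Length;
    }

    private static void Method1(MutableTextDocument document)
    {
        Method2(document);
    }

    // Two calls deep, the caller's document is silently changed.
    private static void Method2(MutableTextDocument document)
    {
        document.Content = document.Content.Trim();
    }
}
```

A caller that passes a document to DoSomething gets the same object back with different content, and can only find out why by reading Method1 and Method2. With the immutable TextDocument class, the assignment in Method2 would not compile because Content has no setter.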

Modifying immutable data objects

Immutable data objects cannot be modified. If we need to have a modified version of a data object, we have to create another one that has the same data, except for the part that we want to change. For example, if we want to append “Thank you” to the content of a document, we would create a new document like this:

var newDocument =
    new TextDocument(
        existingDocument.Identifier,
        existingDocument.Content + Environment.NewLine + "Thank you");

What if a data object has ten properties and we just want to “modify” one property?

We need to invoke a constructor and pass ten arguments. Nine of these arguments will just be simple property access expressions like existingDocument.Identifier. That code would look ugly.

To fix this problem, we can create With methods. Consider these With methods for the TextDocument class:

public TextDocument WithIdentifier(String newValue)
{
    return new TextDocument(identifier: newValue, content: Content);
}

public TextDocument WithContent(String newValue)
{
    return new TextDocument(identifier: Identifier, content: newValue);
}

Each of these methods allows us to “modify” a specific property of an object. Here is how to use one:

var newDocument =
    existingDocument
        .WithContent(existingDocument.Content + Environment.NewLine + "Thank you");
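
The pieces above can be put together into one runnable sketch. It repeats the TextDocument class with its two With methods and adds a small WithDemo class (a name made up for this example) to show that With calls can be chained and that the original object is left untouched:

```csharp
using System;

public class TextDocument
{
    public TextDocument(string identifier, string content)
    {
        Identifier = identifier;
        Content = content;
    }

    public string Identifier { get; }

    public string Content { get; }

    public TextDocument WithIdentifier(string newValue)
    {
        return new TextDocument(identifier: newValue, content: Content);
    }

    public TextDocument WithContent(string newValue)
    {
        return new TextDocument(identifier: Identifier, content: newValue);
    }
}

public static class WithDemo
{
    public static void Main()
    {
        var original = new TextDocument("doc-1", "Hello");

        // Each With call returns a new object, so calls can be chained.
        var copy = original
            .WithIdentifier("doc-2")
            .WithContent(original.Content + Environment.NewLine + "Thank you");

        Console.WriteLine(original.Identifier); // doc-1 (the original is unchanged)
        Console.WriteLine(copy.Identifier);     // doc-2
    }
}
```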

The DataObjectHelper Visual Studio extension

Writing With methods for large data objects is a tedious and error-prone task. To facilitate this, I have created a Visual Studio extension that generates such methods automatically.

You can download the DataObjectHelper extension via Visual Studio’s Extensions and Updates tool from the Tools menu.

Once installed, you can use it by doing the following:

1. Create the following class anywhere in your code base:

[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public class CreateWithMethodsAttribute : Attribute
{
    public Type[] Types;

    public CreateWithMethodsAttribute(params Type[] types)
    {
        Types = types;
    }
}

2. Create a static class (of any name) to contain the With methods and apply the CreateWithMethods attribute to this class like this:

[CreateWithMethods(typeof(TextDocument))]
public static class ExtensionMethods
{
}

Note that when applying the attribute, we provide the type of data objects that we wish to generate With methods for. A single applied attribute can be given many types.

3. Hover over the CreateWithMethods attribute (where it is applied on the ExtensionMethods class). The icon for Roslyn-based refactorings should show up. Look for a refactoring titled “Create With methods” and click on it.

Figure 2: Creating With methods using the DataObjectHelper Visual Studio extension

Figure 2 shows the preview that Visual Studio displays of the With methods that the refactoring is about to generate.

Please note that the extension creates With methods as extension methods, and not as instance methods inside the data object. One advantage of this is that you can create such methods for data objects that you don’t have the source code for.
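
As a rough idea of what With methods look like in extension-method form, here is a hand-written sketch for TextDocument. It is an approximation written for this illustration; the exact code that the extension generates may differ, and the CreateWithMethods attribute is left out to keep the sample self-contained:

```csharp
public class TextDocument
{
    public TextDocument(string identifier, string content)
    {
        Identifier = identifier;
        Content = content;
    }

    public string Identifier { get; }

    public string Content { get; }
}

// Hand-written approximation of With methods as extension methods.
public static class ExtensionMethods
{
    public static TextDocument WithIdentifier(this TextDocument instance, string newValue)
    {
        return new TextDocument(identifier: newValue, content: instance.Content);
    }

    public static TextDocument WithContent(this TextDocument instance, string newValue)
    {
        return new TextDocument(identifier: instance.Identifier, content: newValue);
    }
}
```

Because these methods only use the public constructor and the public properties, call sites such as existingDocument.WithContent("...") look the same as with instance methods.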

Make invalid states unrepresentable

This means that the structure of the data objects should be designed in a way that only valid states can be represented by such objects. Consider this data class:

public enum ResultType { Success, PartialSuccess, Failure }

public class DocumentTranslationResult
{
    public DocumentTranslationResult(
        ResultType result,
        Document translatedDocument,
        ImmutableDictionary<int, ImmutableArray<string>> pageErrors,
        string error)
    {
        Error = error;
        PageErrors = pageErrors;
        TranslatedDocument = translatedDocument;
        Result = result;
    }

    public ResultType Result { get; }

    public string Error { get; }

    public ImmutableDictionary<int, ImmutableArray<string>> PageErrors { get; }

    public Document TranslatedDocument { get; }
}

This class represents the result of translating a text document (e.g. from English to Spanish). It has four properties.

The Result property can contain one of three values:

1. Success: the translation is 100% successful. In this case, the TranslatedDocument property would contain the translated document. All other properties would be null.

2. PartialSuccess: the translation is partially successful. In this case, the PageErrors property would contain a list of errors for each page in the source document. This is represented by a dictionary where the key is an integer representing the page number, and the value is an array of error messages for that page. The ImmutableDictionary and ImmutableArray types are similar to the standard Dictionary class in .NET and arrays in C#, except that they are immutable. For more information, see this article from Microsoft: https://msdn.microsoft.com/en-us/magazine/mt795189.aspx.

Also, the TranslatedDocument property would contain the translated document. We should expect to have some untranslated words or statements in this document. All other properties would be null.

3. Failure: Nothing could be translated. This can happen, for example, when a translation web service that we use is offline. The Error property would contain an error message. Other properties would be null.

Note that the constructor of this class allows us to specify the values of the four properties when we construct the object. The problem with such a design is that we can create an instance of DocumentTranslationResult that has an invalid state. For example, we can create an instance where Result has a value of Success but TranslatedDocument is null:

var instance =
    new DocumentTranslationResult(
        result: ResultType.Success,
        translatedDocument: null,
        pageErrors: null,
        error: serviceError); 

This is clearly a programmer mistake.

Either the programmer wanted to set Result to Failure but set it to Success by mistake, or she wanted to set error to null and translatedDocument to a specific value but forgot to do so (maybe as a result of copying and pasting some other code?).

One attempt to fix this problem is to throw exceptions on invalid states. Consider this updated constructor:

public DocumentTranslationResult(
    ResultType result,
    Document translatedDocument,
    ImmutableDictionary<int, ImmutableArray<string>> pageErrors,
    string error)
{
    if(result == ResultType.Success && translatedDocument == null)
        throw new ArgumentNullException(nameof(translatedDocument));

    if(result == ResultType.Failure && error == null)
        throw new ArgumentNullException(nameof(error));

    if (result == ResultType.PartialSuccess && pageErrors == null)
        throw new ArgumentNullException(nameof(pageErrors));

    if (result == ResultType.PartialSuccess && translatedDocument == null)
        throw new ArgumentNullException(nameof(translatedDocument));

    Error = error;
    PageErrors = pageErrors;
    TranslatedDocument = translatedDocument;
    Result = result;
} 

Now, if we try to construct the same invalid object as the example above, an exception will be thrown.

This is better since we will get an exception when we construct the object, instead of getting a NullReferenceException when we try to use the object later.

Still, we can only detect this problem at runtime. It would be better if we could detect and prevent this problem at compile time, that is, if trying to construct an invalid object produced a compilation error.

Let’s now consider another example:

var instance =
    new DocumentTranslationResult(
        result: ResultType.Success,
        translatedDocument: document,
        pageErrors: null,
        error: serviceError); 

Again, this is clearly a programmer mistake.

This code will compile and will not give us a runtime exception when executed. When debugging the application, the programmer might hover over the instance variable and see a non-null value for the Error property and incorrectly think that this result represents a failed result.

We can mitigate this problem by throwing an exception when the result argument is Success and the error or the PageErrors arguments are non-null. Such invalid state checking in the constructor acts as documentation that helps programmers understand which states are valid and which are not.
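
A sketch of these extra checks for the Success case could look like the following. The Document class here is a minimal stand-in assumed for this example (the article does not show its definition), and the exception messages are made up:

```csharp
using System;
using System.Collections.Immutable;

public enum ResultType { Success, PartialSuccess, Failure }

// Minimal stand-in for the Document type (an assumption for this sketch).
public class Document
{
    public Document(string content) { Content = content; }

    public string Content { get; }
}

public class DocumentTranslationResult
{
    public DocumentTranslationResult(
        ResultType result,
        Document translatedDocument,
        ImmutableDictionary<int, ImmutableArray<string>> pageErrors,
        string error)
    {
        if (result == ResultType.Success)
        {
            if (translatedDocument == null)
                throw new ArgumentNullException(nameof(translatedDocument));

            // A successful result must not carry any error information.
            if (error != null)
                throw new ArgumentException("error must be null on success", nameof(error));

            if (pageErrors != null)
                throw new ArgumentException("pageErrors must be null on success", nameof(pageErrors));
        }

        // The checks for PartialSuccess and Failure follow the same pattern.

        Error = error;
        PageErrors = pageErrors;
        TranslatedDocument = translatedDocument;
        Result = result;
    }

    public ResultType Result { get; }

    public string Error { get; }

    public ImmutableDictionary<int, ImmutableArray<string>> PageErrors { get; }

    public Document TranslatedDocument { get; }
}
```

With these checks in place, the second invalid example above fails with an ArgumentException at construction time instead of producing a misleading object.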

There is a better option!

Let’s remove the constructor and add these three constructors instead:

public DocumentTranslationResult(Document translatedDocument)
{
    Result = ResultType.Success;
    TranslatedDocument = translatedDocument;
}

public DocumentTranslationResult(
    Document translatedDocument,
    ImmutableDictionary<int, ImmutableArray<string>> pageErrors)
{
    Result = ResultType.PartialSuccess;
    TranslatedDocument = translatedDocument;
    PageErrors = pageErrors;
}

public DocumentTranslationResult(
    string error)
{
    Result = ResultType.Failure;
    Error = error;
}

This is better. Now it is harder for the programmer to make the mistakes described earlier. With this change, only the following three ways of constructing the object will result in a successful compilation:

var instance1 = new DocumentTranslationResult(document);

var instance2 = new DocumentTranslationResult(document, pageErrors);

var instance3 = new DocumentTranslationResult(serviceError); 

Now, these constructors act as the documentation for valid states.

Still, one can argue that by looking at the four properties of this class alone (without looking at the constructors), one cannot easily understand which combinations of values are valid and which are not.

Consider what happens if you want to use IntelliSense to view the properties of a variable of type DocumentTranslationResult. Figure 3 shows what it looks like:

Figure 3: IntelliSense for DocumentTranslationResult

All that the programmer can determine from this is that this object has four properties.

This doesn’t tell her/him which combinations are valid and which are not. The programmer has to go and see the constructors to understand which combinations are valid and which are not.

Let’s say that the programmer wants to write a method to generate a formatted message about the result of translation. Consider this FormatResultForUser method that the programmer writes:

public string FormatResultForUser(DocumentTranslationResult result)
{
    switch (result.Result)
    {
        case ResultType.Success:
            return
                "Success. Translated document length: "
                + result.TranslatedDocument.Content.Length;

        case ResultType.Failure:
            return "Failure. Error: " + result.Error;

        case ResultType.PartialSuccess:
            return
                "Partial success. Number of errors: "
                + result.PageErrors.Values.Sum(x => x.Length)
                + ", Translated document length: "
                + result.TranslatedDocument.Content.Length;

        default:
            throw new Exception("Unexpected result type");
    }
}

In this code, the programmer is switching on the Result property to handle each case. One problem with this code is that when the programmer accesses the result parameter, all the properties are available. For example, in the Success case, the programmer should only access the TranslatedDocument property. However, nothing prevents her from accessing the Error property.

Can we find a better design that better communicates how to interpret and use the DocumentTranslationResult class?

Consider this alternative design:

public abstract class DocumentTranslationResult
{
    private DocumentTranslationResult(){}

    public sealed class Success : DocumentTranslationResult
    {
        public Success(Document translatedDocument)
        {
            TranslatedDocument = translatedDocument;
        }

        public Document TranslatedDocument { get; }
    }

    public sealed class Failure : DocumentTranslationResult
    {
        public Failure(string error)
        {
            Error = error;
        }

        public string Error { get; }
    }

    public sealed class PartialSuccess : DocumentTranslationResult
    {
        public PartialSuccess(
            ImmutableDictionary<int, ImmutableArray<string>> pageErrors,
            Document translatedDocument)
        {
            PageErrors = pageErrors;
            TranslatedDocument = translatedDocument;
        }

        public ImmutableDictionary<int, ImmutableArray<string>> PageErrors { get; }

        public Document TranslatedDocument { get; }
    }
}

In this updated design, the DocumentTranslationResult class is abstract and contains no properties whatsoever. It has three derived classes that are nested in it: Success, Failure, and PartialSuccess.

The Success class has only one property - TranslatedDocument, the Failure class also has one property - Error, and the PartialSuccess class has two properties: PageErrors and TranslatedDocument.

Like before, we have three constructors, one constructor per subclass. But now, each subclass has only some of the properties of the original class. It is easier to tell which properties belong to which result type.

Note also that the ResultType enum is no longer needed. The type of the result is encoded in the type system, i.e., as a derived class of DocumentTranslationResult. This has the additional benefit of preventing invalid enum values.

In C#, enum variables are not guaranteed to hold a valid value. By default, the underlying type behind an enum is int. This means that any int value can be converted to ResultType. For example, (ResultType)100 is a valid value to be assigned to a variable of type ResultType.
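
The following small sketch demonstrates this. EnumDemo is a name made up for the example; Enum.IsDefined is the standard .NET method for checking whether a value corresponds to a declared enum member:

```csharp
using System;

public enum ResultType { Success, PartialSuccess, Failure }

public static class EnumDemo
{
    public static void Main()
    {
        // Compiles and runs without error, although no member has the value 100.
        ResultType bogus = (ResultType)100;

        Console.WriteLine(Enum.IsDefined(typeof(ResultType), bogus));              // False
        Console.WriteLine(Enum.IsDefined(typeof(ResultType), ResultType.Failure)); // True
    }
}
```

Code that switches on such an enum therefore needs a default case (or an explicit Enum.IsDefined check) to deal with values like this one.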

Note also that there is a parameterless constructor of DocumentTranslationResult that is private. This gives us better control over which classes can inherit from DocumentTranslationResult. Usually, if the base class constructor is private, no classes can inherit from it. However, nested classes are special in that they can access the private members of the classes they reside in.

Now, if we want to modify this data object to add another result type, we can create another nested class inside DocumentTranslationResult that derives from it. Such a class has to be a nested class in DocumentTranslationResult or the code will not compile.

Now, if we have a variable of type DocumentTranslationResult.PartialSuccess and we try to use IntelliSense to view its properties, we can see the two relevant properties, as seen in Figure 4.

Figure 4: IntelliSense for DocumentTranslationResult.PartialSuccess

However, if we have a variable of type DocumentTranslationResult (which can hold an instance of any of the three subclasses), we get the following in IntelliSense (see Figure 5).

Figure 5: IntelliSense for DocumentTranslationResult abstract class

This is very bad!

Now, we have no idea how to work with this data object. We have to go to the class definition and read it. Only after reading it do we know about the three subclasses and the properties they contain.

Before I propose a fix for this, let’s see what the FormatResultForUser method looks like now:

public string FormatResultForUser(DocumentTranslationResult result)
{
    switch (result)
    {
        case DocumentTranslationResult.Success successResult:
            return
                "Success. Translated document length: "
                + successResult.TranslatedDocument.Content.Length;

        case DocumentTranslationResult.Failure failureResult:
            return "Failure. Error: " + failureResult.Error;

        case DocumentTranslationResult.PartialSuccess partialSuccessResult:
            return
                "Partial success. Number of errors: "
                + partialSuccessResult.PageErrors.Values.Sum(x => x.Length)
                + ", Translated document length: "
                + partialSuccessResult.TranslatedDocument.Content.Length;

        default:
            throw new Exception("Unexpected result type");
    }
}

Similar to the FormatResultForUser method from before, we use a switch statement to switch on the type of the result. The ability to switch over the type of a specific variable is only available as of C# 7, as a result of introducing the pattern matching feature.

Notice that for each case, we are defining a new variable. For example, for the Success case we have a variable called successResult of type DocumentTranslationResult.Success. This means that we can only access properties related to the success case via this variable. Contrast this with the FormatResultForUser method from before. There, we had a single variable result that gave access to all the properties.

For more information about the pattern matching feature of C# 7, read the C# 7 - What's New article here at DotNetCurry.

Still, as I showed earlier, a variable of type DocumentTranslationResult does not tell us anything useful in IntelliSense. One way to fix this is to add a Match method to the DocumentTranslationResult class like this:

public TResult Match<TResult>(
    Func<Success, TResult> caseSuccess,
    Func<Failure, TResult> caseFailure,
    Func<PartialSuccess, TResult> casePartialSuccess)
{
    switch (this)
    {
        case Success success:
            return caseSuccess(success);
        case Failure failure:
            return caseFailure(failure);
        case PartialSuccess partialSuccess:
            return casePartialSuccess(partialSuccess);
        default:
            throw new Exception(
                "You added a new subtype of DocumentTranslationResult without updating the Match method");
    }
}

The name of the method Match comes from pattern matching.

Now if we use IntelliSense to access a variable of type DocumentTranslationResult, we see the Match method as shown in Figure 6:

Figure 6: IntelliSense of the DocumentTranslationResult.Match method

Using IntelliSense, we know that the translation result can be of three types: Success, Failure, and PartialSuccess. Here is what the updated FormatResultForUser method looks like:

public string FormatResultForUser(DocumentTranslationResult result)
{
    return result.Match(
        caseSuccess: success =>
            "Success. Translated document length: "
            + success.TranslatedDocument.Content.Length,
        caseFailure: failure =>
            "Failure. Error: " + failure.Error,
        casePartialSuccess: partialSuccess =>
            "Partial success. Number of errors: "
            + partialSuccess.PageErrors.Values.Sum(x => x.Length)
            + ", Translated document length: "
            + partialSuccess.TranslatedDocument.Content.Length);
}

For each result type, we provide a lambda that takes the specific subtype and gives back a value. This means that in each case, we only have access to the properties defined in the corresponding subtype.

The Match method is a generic method that has a generic type parameter TResult. This allows us to use it to return a value of any type. Of course, all the lambdas passed to the Match method must return a value of the same type.

The Match method forces the matching to be exhaustive: code that calls Match has to handle all cases, or it won’t compile. Compare this with the switch statement, where we can forget to include a specific case and the compiler will not complain.

Sometimes, we don’t want to extract or generate data from an instance of DocumentTranslationResult. Instead, we simply want to invoke some code for each case.

For example, we might want to write some data from the object to the console. The following Match method overload helps us achieve this:

public void Match(
    Action<Success> caseSuccess,
    Action<Failure> caseFailure,
    Action<PartialSuccess> casePartialSuccess)
{
    switch (this)
    {
        case Success success:
            caseSuccess(success);
            break;
        case Failure failure:
            caseFailure(failure);
            break;
        case PartialSuccess partialSuccess:
            casePartialSuccess(partialSuccess);
            break;
        default:
            throw new Exception(
                "You added a new subtype of DocumentTranslationResult without updating the Match method");
    }
}

Instead of taking Func delegates, this method takes Action delegates. The method will call the appropriate action based on the type of the result.
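
Here is a self-contained sketch of how this overload might be used to write a result to the console. It repeats the class hierarchy from above with only the Action-based Match method. The Document class is a minimal stand-in, and the MatchDemo class and its WriteResultToConsole method are hypothetical names introduced for this example:

```csharp
using System;
using System.Collections.Immutable;
using System.Linq;

// Minimal stand-in for the Document type (an assumption for this sketch).
public class Document
{
    public Document(string content) { Content = content; }

    public string Content { get; }
}

public abstract class DocumentTranslationResult
{
    private DocumentTranslationResult() { }

    public sealed class Success : DocumentTranslationResult
    {
        public Success(Document translatedDocument) { TranslatedDocument = translatedDocument; }

        public Document TranslatedDocument { get; }
    }

    public sealed class Failure : DocumentTranslationResult
    {
        public Failure(string error) { Error = error; }

        public string Error { get; }
    }

    public sealed class PartialSuccess : DocumentTranslationResult
    {
        public PartialSuccess(
            ImmutableDictionary<int, ImmutableArray<string>> pageErrors,
            Document translatedDocument)
        {
            PageErrors = pageErrors;
            TranslatedDocument = translatedDocument;
        }

        public ImmutableDictionary<int, ImmutableArray<string>> PageErrors { get; }

        public Document TranslatedDocument { get; }
    }

    public void Match(
        Action<Success> caseSuccess,
        Action<Failure> caseFailure,
        Action<PartialSuccess> casePartialSuccess)
    {
        switch (this)
        {
            case Success success: caseSuccess(success); break;
            case Failure failure: caseFailure(failure); break;
            case PartialSuccess partialSuccess: casePartialSuccess(partialSuccess); break;
            default: throw new Exception("Unexpected subtype of DocumentTranslationResult");
        }
    }
}

public static class MatchDemo
{
    // Hypothetical helper: each lambda only sees the properties of its own case.
    public static void WriteResultToConsole(DocumentTranslationResult result)
    {
        result.Match(
            caseSuccess: success =>
                Console.WriteLine("Success. Length: " + success.TranslatedDocument.Content.Length),
            caseFailure: failure =>
                Console.WriteLine("Failure. Error: " + failure.Error),
            casePartialSuccess: partialSuccess =>
                Console.WriteLine("Partial success. Number of errors: "
                    + partialSuccess.PageErrors.Values.Sum(x => x.Length)));
    }
}
```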

Using the DataObjectHelper extension to create Match methods

The DataObjectHelper extension can also help with generating Match methods. Here is how to use it to do so:

1. Create the following class anywhere in the codebase:

[AttributeUsage(AttributeTargets.Class, AllowMultiple = true)]
public class CreateMatchMethodsAttribute : Attribute
{
    public Type[] Types;

    public CreateMatchMethodsAttribute(params Type[] types)
    {
        Types = types;
    }
}

2. Create a static class (of any name) to contain the Match methods and apply the CreateMatchMethods attribute to this class like this:

[CreateMatchMethods(typeof(DocumentTranslationResult))]
public static class ExtensionMethods
{
}

3. Hover over the CreateMatchMethods attribute (where it is applied on the ExtensionMethods class). The icon for Roslyn-based refactorings should show up. Look for a refactoring titled “Create Match methods” and click on it.

Like With methods, Match methods are created as extension methods.

Other issues

Let’s look at the DocumentTranslationResult.PartialSuccess class. This class has a property of type ImmutableDictionary<int, ImmutableArray<string>> called PageErrors. This property has these issues:

Issue 1: The page number is represented using an integer. Integers can have the value 0 and even negative values. However, a page number cannot be zero or negative. The current design allows us to create an instance of DocumentTranslationResult.PartialSuccess that has errors for page -5 for example.

Of course, we can fix this problem by throwing exceptions in the constructor of the DocumentTranslationResult.PartialSuccess class. However, a better way to communicate that page numbers have to be positive is to create a type for positive integers like the following:

public class PositiveInteger
{
    public int Value { get; }

    public PositiveInteger(int value)
    {
        if (value < 1)
            throw new ArgumentException("value has to be positive", nameof(value));

        Value = value;
    }
}

And change the PageErrors property like this:

public ImmutableDictionary<PositiveInteger, ImmutableArray<string>> PageErrors { get; }

At runtime, we are still protecting against zero and negative page numbers by throwing exceptions. However, this new design communicates in a better way which page numbers are valid and which are not. Also, instead of duplicating the exception-throwing code in every place that needs to accept page numbers, we centralize this check in the PositiveInteger class.
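
One practical detail when PositiveInteger is used as a dictionary key: because it is a class, two instances that wrap the same number are treated as different keys unless the class defines value equality. The following sketch shows one way to add it. The Equals, GetHashCode, and ToString members and the sealed modifier are additions made for this illustration and are not part of the class shown above:

```csharp
using System;

public sealed class PositiveInteger : IEquatable<PositiveInteger>
{
    public int Value { get; }

    public PositiveInteger(int value)
    {
        if (value < 1)
            throw new ArgumentException("value has to be positive", nameof(value));

        Value = value;
    }

    // Two instances are equal when they wrap the same number.
    public bool Equals(PositiveInteger other)
    {
        return other != null && Value == other.Value;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as PositiveInteger);
    }

    // Must be consistent with Equals so that dictionary lookups work.
    public override int GetHashCode()
    {
        return Value;
    }

    public override string ToString()
    {
        return Value.ToString();
    }
}
```

With this in place, looking up pageErrors[new PositiveInteger(3)] finds the entry that was stored under another PositiveInteger instance with the value 3.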

There are ways to get better compile-time invalid state checking here. For example, we could write an analyzer that prevents the code from compiling if we construct a PositiveInteger class using a non-positive integer literal.

Additionally, we could make the analyzer prevent construction of this class using a non-literal integer. A special static TryCreate method can be used to allow the construction of PositiveInteger instances using a non-literal int:

public static bool TryCreate(int value, out PositiveInteger positiveInteger)
{
    if (value < 1)
    {
        positiveInteger = null;
        return false;
    }

    positiveInteger = new PositiveInteger(value);

    return true;
}

This has the advantage of making it very clear to the developer who is creating an instance of a PositiveInteger that not all instances of int can be converted to PositiveInteger. The developer has to try the conversion and check whether it was successful.

A better signature for this method would use the Maybe monad (more on this later):

public static Maybe<PositiveInteger> TryCreate(int value)

To make the design even more descriptive, we can use a PageNumber class (that contains a PositiveInteger) for the key of the dictionary to represent the page number.

Issue 2: String is used as the type of error. A better option is to create an ErrorMessage class to better communicate that this represents an error message.

Issue 3: The dictionary and the array can be empty. However, if the document is partially successful, then at least one page must contain at least one error.

Again, we can throw exceptions at runtime in the DocumentTranslationResult.PartialSuccess class constructor if the dictionary is empty or the error array for any page is empty. However, it is much better to create NonEmptyImmutableDictionary and NonEmptyImmutableArray classes in the same way we created the PositiveInteger class.

Also, to make the design more descriptive, we can create a PageErrors data object to represent page errors. Such an object would simply contain the dictionary property.
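
As an illustration, here is one possible shape for a non-empty array type. This is only a sketch under assumptions: the member names Items, First, and Length are made up, and instead of throwing in the constructor the way PositiveInteger does, this variant requires a first element as a separate parameter so that an empty instance cannot be expressed at all:

```csharp
using System;
using System.Collections.Immutable;
using System.Linq;

// Hypothetical wrapper that guarantees at least one element.
public sealed class NonEmptyImmutableArray<T>
{
    public ImmutableArray<T> Items { get; }

    // The mandatory first parameter rules out the empty case at compile time.
    public NonEmptyImmutableArray(T first, params T[] rest)
    {
        Items = ImmutableArray.CreateRange(
            new[] { first }.Concat(rest ?? Array.Empty<T>()));
    }

    public T First => Items[0];

    public int Length => Items.Length;
}
```

A call such as new NonEmptyImmutableArray<string>() does not compile, while new NonEmptyImmutableArray<string>("Unknown word on line 3") creates an array with one error message.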

Note: The tendency to use a primitive type (e.g. int) instead of a special class (e.g. PageNumber) to represent a page number is called primitive obsession.

Don’t use null to represent optional values

If you are reading a data class and you see a property of type string, can you tell whether it is optional or required? In other words, is it valid for it to be null?

We sometimes use the same type, e.g. string, to represent required and optional values, and this is a mistake.

Model your data objects in a way that makes it clear which values are optional and which are not. The Maybe monad (also called the Option monad) can be used to mark a value, e.g. a property, as optional. Consider this Customer data object:

public class Customer
{
    public Address PrimaryAddress { get; }

    public Maybe<Address> SecondaryAddress { get; }

    //..
}

It is very clear from this design that the primary address is required, while the secondary address is optional.

A very simple implementation of the Maybe monad looks like this:

public struct Maybe<T>
{
    private readonly bool hasValue;

    private readonly T value;

    private Maybe(bool hasValue, T value)
    {
        this.hasValue = hasValue;
        this.value = value;
    }

    public bool HasValue => hasValue;

    public T GetValue() => value;

    public static Maybe<T> OfValue(T value)
    {
        if (value == null)
            throw new Exception("Value cannot be null");

        return new Maybe<T>(true, value);
    }

    public static Maybe<T> NoValue()
    {
        return new Maybe<T>(false, default);
    }
}
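
To show how such a type is consumed, here is a self-contained sketch that repeats the Maybe struct above (writing default(T) so that it also compiles with C# 7.0) and uses it for the optional secondary address. The Address class with its City property, the Customer constructor, and the MaybeDemo class are simplified stand-ins assumed for this example:

```csharp
using System;

public struct Maybe<T>
{
    private readonly bool hasValue;

    private readonly T value;

    private Maybe(bool hasValue, T value)
    {
        this.hasValue = hasValue;
        this.value = value;
    }

    public bool HasValue => hasValue;

    public T GetValue() => value;

    public static Maybe<T> OfValue(T value)
    {
        if (value == null)
            throw new Exception("Value cannot be null");

        return new Maybe<T>(true, value);
    }

    public static Maybe<T> NoValue()
    {
        return new Maybe<T>(false, default(T));
    }
}

// Simplified stand-ins for the Address and Customer types (assumptions).
public class Address
{
    public Address(string city) { City = city; }

    public string City { get; }
}

public class Customer
{
    public Customer(Address primaryAddress, Maybe<Address> secondaryAddress)
    {
        PrimaryAddress = primaryAddress;
        SecondaryAddress = secondaryAddress;
    }

    public Address PrimaryAddress { get; }

    public Maybe<Address> SecondaryAddress { get; }
}

public static class MaybeDemo
{
    public static string DescribeSecondaryAddress(Customer customer)
    {
        // The type forces the reader to notice that the value may be missing.
        return customer.SecondaryAddress.HasValue
            ? customer.SecondaryAddress.GetValue().City
            : "(no secondary address)";
    }
}
```

Note that this simple version does not stop a caller from invoking GetValue without checking HasValue first. Fuller implementations usually address this, for example by offering a Match-style method.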

There are many open source C# implementations available for the Maybe monad. When creating the DataObjectHelper extension, I wrote one implementation. You can find it here: https://github.com/ymassad/DataObjectHelper/blob/master/DataObjectHelper/DataObjectHelper/Maybe.cs

Sum types and Product types

The DocumentTranslationResult class I used in the example (as finally designed) is a sum type. A sum type is a data structure that can be any one of a fixed set of types. In our example, these were the Success, Failure, and PartialSuccess types.

These three subtypes themselves are product types. In simple terms, a product type defines a set of properties that an instance of the type needs to provide values for. For example, the DocumentTranslationResult.PartialSuccess type has two properties: PageErrors and TranslatedDocument. Any instance of DocumentTranslationResult.PartialSuccess must provide a value for these two properties.

Sum types and product types can be composed to form complex data objects. The DocumentTranslationResult example in this article has shown a sum type whose possible subtypes are product types. But this can be even deeper. For example, the Success case can hold a property of type TranslationMethod which can be either Local or ViaWebService. This TranslationMethod type would be a sum type. Again, the Local and ViaWebService types can be product types each containing details specific to them. For example, the ViaWebService type can have a property to tell which web service was used.

Design your data classes so that they are either sum types or product types. In C#, this means that a data object should either be an abstract class that has no properties and has derived types representing its cases, or a sealed class that contains properties.

If you design your data objects this way, then you only need Match methods for sum types and With methods for product types.
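As a rough sketch of this rule, here is how the TranslationMethod type mentioned above might look when written by hand. The ServiceName property and the empty Local case are assumptions for illustration. The Match and With methods are written here as instance methods only to keep the sketch self-contained; this is not necessarily the shape of the code that DataObjectHelper generates.

```csharp
using System;

// Sum type: abstract, no properties, and a closed set of nested cases.
public abstract class TranslationMethod
{
    // The private constructor means only the nested classes can derive from this type.
    private TranslationMethod() { }

    // Case with no data of its own (assumed; the article gives no details for Local).
    public sealed class Local : TranslationMethod
    {
    }

    // Product type: sealed, with read-only properties.
    public sealed class ViaWebService : TranslationMethod
    {
        public ViaWebService(string serviceName)
        {
            ServiceName = serviceName ?? throw new ArgumentNullException(nameof(serviceName));
        }

        // Hypothetical property telling which web service was used.
        public string ServiceName { get; }

        // With method: returns a modified copy and leaves this instance unchanged.
        public ViaWebService WithServiceName(string newValue) => new ViaWebService(newValue);
    }

    // Match method: the caller must supply one handler per case.
    public TResult Match<TResult>(
        Func<Local, TResult> local,
        Func<ViaWebService, TResult> viaWebService)
    {
        if (this is Local l) return local(l);
        if (this is ViaWebService w) return viaWebService(w);

        throw new InvalidOperationException("Unknown TranslationMethod case");
    }
}
```

A caller could then write method.Match(l => "local", w => "via " + w.ServiceName). If a new case is added to the sum type, Match gets a new parameter and every call site stops compiling until it handles the new case.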

By the way, the Maybe monad is a sum type because it can either have a value or not have a value. In my implementation, I use a struct for technical reasons. However, I have seen implementations of it that use two classes to represent the two cases.
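For comparison, a two-class version could look roughly like the following sketch. The names Some and None and the Match signature are my own choices for illustration and are not taken from any particular library.

```csharp
using System;

// Maybe written as a sum type with two cases instead of a struct.
public abstract class Maybe<T>
{
    // Only the nested classes can derive from Maybe<T>.
    private Maybe() { }

    // Case 1: a value is present.
    public sealed class Some : Maybe<T>
    {
        public Some(T value)
        {
            if (value == null)
                throw new ArgumentNullException(nameof(value));

            Value = value;
        }

        public T Value { get; }
    }

    // Case 2: no value is present.
    public sealed class None : Maybe<T>
    {
    }

    // Forces the caller to handle both cases.
    public TResult Match<TResult>(Func<T, TResult> some, Func<TResult> none)
    {
        if (this is Some s) return some(s.Value);

        return none();
    }
}
```

One trade-off of a class-based version is that a variable of type Maybe<T> can itself be null, which cannot happen with a struct.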

Sum types and product types and combinations of these are called Algebraic Data Types. These types have very good support in functional programming languages. In the next section, I will show you how you can easily create your data objects in F# and use them from your C# projects.

Note: The phrase “make illegal states unrepresentable” was introduced by Yaron Minsky. For more information, see this video: https://blog.janestreet.com/effective-ml-video/

Using F# to create data objects and use them in C# projects

F# is a .NET functional programming language. Although it is a functional language, it also has object-orientation support. Because it compiles to MSIL as any .NET language does, code written in F# can be consumed in C#, and vice-versa.

F# has great support for algebraic data types. Consider this code:

module MyDomain =
    type Document = {Identifier : string; Content: string;}

    type imdictionary<'k,'v> = System.Collections.Immutable.IImmutableDictionary<'k,'v>
    type imarray<'e> = System.Collections.Immutable.ImmutableArray<'e>

    type DocumentTranslationResult =
        | Success of TranslatedDocument : Document
        | Failure of Error: string
        | PartialSuccess of PageErrors : imdictionary<int, imarray<string>> * TranslatedDocument : Document

The code starts by creating a module.

A module in F# is simply a grouping of F# code. This module contains a Document record. A record in F# defines a set of named values. In C# terms, this is like a data class which contains a set of properties. A record in F# is a product type.

The Document record has two values (properties in C#): Identifier of type string and Content of type string.

Next, we define two type aliases: imdictionary and imarray. These simply make the IImmutableDictionary and ImmutableArray types easier to use.

Next, we define what is called a discriminated union in F#. This is a sum type.

A discriminated union allows us to create a type that can be one of a fixed set of cases. Here, we define the DocumentTranslationResult discriminated union, which can be Success, Failure, or PartialSuccess. Notice how we define the fields (or properties in C#) for each case.

Now compare the amount of code we write here versus the amount of code we need to write in C#.

We can write our data objects as F# records and discriminated unions and then write our units of behavior in C#.

Note: Types created with F# are somewhat similar to the types I described in this article when viewed from C#.

There are some differences though.

The DataObjectHelper library can be used to create With and Match methods for types created in F#. The static class that will contain these methods must exist in a C# project, but the types themselves can come from F#. The DataObjectHelper library supports creating With and Match methods for all the types in a specific module. A module in F# is compiled as a static class in MSIL. If you specify this class when applying the CreateWithMethods and the CreateMatchMethods attributes, With and Match methods for all types within this module will be generated.

By the way, in the future, C# is expected to have records similar to F# records. You can see the proposal here: https://github.com/dotnet/csharplang/blob/master/proposals/records.md

A data object should represent one thing only

Don’t be tempted to use a single data class for many purposes. For example, if there is another translation process in your application that can either succeed or fail (cannot be partially successful), don’t reuse the same DocumentTranslationResult class because it models a result that can also be partially successful. In this case, create another data object to model this new result type.
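As a sketch, such a separate result type could look like this. TitleTranslationResult and its properties are hypothetical names chosen for illustration. The point is that the type has exactly two cases, so no caller ever has to supply a meaningless PartialSuccess handler.

```csharp
using System;

// Hypothetical result type for a process that can only succeed or fail.
public abstract class TitleTranslationResult
{
    // Only the nested classes can derive from this type.
    private TitleTranslationResult() { }

    public sealed class Success : TitleTranslationResult
    {
        public Success(string translatedTitle)
        {
            TranslatedTitle = translatedTitle ?? throw new ArgumentNullException(nameof(translatedTitle));
        }

        public string TranslatedTitle { get; }
    }

    public sealed class Failure : TitleTranslationResult
    {
        public Failure(string error)
        {
            Error = error ?? throw new ArgumentNullException(nameof(error));
        }

        public string Error { get; }
    }

    // Two cases, so two handlers.
    public TResult Match<TResult>(
        Func<Success, TResult> success,
        Func<Failure, TResult> failure)
    {
        if (this is Success s) return success(s);
        if (this is Failure f) return failure(f);

        throw new InvalidOperationException("Unknown TitleTranslationResult case");
    }
}
```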

Non-closed data type hierarchies

There are cases where it makes sense to have something similar to a sum type but whose subtypes are not fixed. That is, other cases can be added later by a developer who is not the author of the type. I chose to make this topic out of scope for this article, but I might discuss it in a future article.

Conclusion

In this article, I have given some recommendations on how to design data objects in C#.

I talked about making the data objects immutable to make multithreading safer and to make the program easier to understand. I also talked about making invalid states unrepresentable. We make invalid states unrepresentable by modeling data objects carefully as sum types or product types.

I talked about making it easier to “modify” immutable objects by creating With methods and how the DataObjectHelper Visual Studio extension helps us generate such methods.

Also, for sum types, the DataObjectHelper extension can help us generate Match methods to easily and safely switch over the different cases represented by sum types.

I also talked about how we can use F# to concisely create our data objects, and use them in C# projects.

This article was technically reviewed by Damir Arh.

This article has been editorially reviewed by Suprotim Agarwal.

Author
Yacoub Massad is a software developer and works mainly on Microsoft technologies. Currently, he works at Zeva International where he uses C#, .NET, and other technologies to create eDiscovery solutions. He is interested in learning and writing about software design principles that aim at creating maintainable software. You can view his blog posts at criticalsoftwareblog.com. He is also the creator of DIVEX(https://divex.dev), a dependency injection tool that allows you to compose objects and functions in C# in a way that makes your code more maintainable. Recently he started a YouTube channel about Roslyn, the .NET compiler.


