The Maybe Monad (C#)

Posted by: Yacoub Massad, on 9/25/2019, in Category Patterns & Practices
Abstract: In this tutorial, I will talk about the Maybe Monad, a container that represents a value that might or might not exist.

In C#, null is a valid value for variables whose types are reference types (e.g. classes).

For example, this is valid C# code:

string str = null;

Additionally, if a method has a return type that is a reference type, null is a valid return value:

string GetLogContents(int id)
{
    var filename = "c:\\logs\\" + id + ".log";

    if (File.Exists(filename))
        return File.ReadAllText(filename);

    return null;
}

The above method returns null if no log file corresponding to the requested id is found.

The caller of the GetLogContents method will receive a string after the method is called. Should the calling method check the value for null before using it?

In this case, it should.

If it doesn’t and the method returns null, a NullReferenceException is thrown if the calling method tries to access a member of the returned string.

What’s more dangerous is that the returned string might be passed several times from method to method before a member of the string is finally used. It is only when a member of the string is used that the NullReferenceException is thrown. This might make it hard to figure out the real cause of the NullReferenceException.

A string return type can also be used in methods that never return null. In these cases, the caller shouldn’t have to check for null.

In C# (before C# 8), there is no way to distinguish between the two cases. Methods that never return null and methods that might return null both have the return type string.

Note: C# 8 is expected to have nullable reference types. This means that methods that might return null can have the return type of string? and methods that never return null can have the return type string.
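As a small sketch of how that distinction could look once nullable reference types are available (this assumes a C# 8 compiler; the LogReader class and its method names are hypothetical and not part of the article's source code):

```csharp
#nullable enable
using System.IO;

public static class LogReader
{
    // May return null: the "?" makes that visible in the signature.
    public static string? FindLog(int id)
    {
        var filename = "c:\\logs\\" + id + ".log";

        return File.Exists(filename) ? File.ReadAllText(filename) : null;
    }

    // Never returns null: callers do not need a null check.
    public static string DescribeLog(int id)
    {
        string? contents = FindLog(id);

        // The compiler warns if contents is dereferenced without this check.
        return contents == null
            ? "Log file not found"
            : "Log length: " + contents.Length;
    }
}
```

The annotation only produces compiler warnings. It does not change what happens at run time.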

Editorial Note: If you haven’t yet read about the new C# 8 features, see New C# 8 Features in Visual Studio 2019. C# 8 is currently in preview at the time of this writing, and the article will be updated soon enough.

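Another way to express the difference in the method signature is a Maybe<T> type. Here is an updated version of the GetLogContents method that returns Maybe<string>: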

Maybe<string> GetLogContents(int id)
{
    var filename = "c:\\logs\\" + id + ".log";

    if (File.Exists(filename))
        return File.ReadAllText(filename);

    return Maybe.None;
}

The signature of the method now tells us that it may return a string or it may not return a value.

Now, methods that always return a string (non-null) can have a return type of string, and methods that may return a string can have a return type of Maybe<string>.

In the rest of the article, I will:

  • discuss different implementations of the Maybe type
  • talk about Map and Bind
  • show you how to use LINQ to work with Maybe values

A sum-type implementation

Note: the source code for this section is found here: https://github.com/ymassad/MaybeExamples/tree/master/MaybeAsASumType

In the Designing Data Objects in C# and F# article, I talked about sum types. A sum type is a data structure that can be any one of a fixed set of types. For example, we can define a Shape sum type that has the following three sub-types:

1. Square (int sideLength)

2. Rectangle (int width, int height)

3. Circle (int diameter)
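One possible way to encode such a Shape type in C# is a closed class hierarchy. The following is only a sketch under that assumption; the property names and the ShapeDemo helper are hypothetical and are not taken from the referenced article:

```csharp
using System;

public abstract class Shape
{
    // Private constructor: only the nested classes below can derive from Shape.
    private Shape()
    {
    }

    public sealed class Square : Shape
    {
        public Square(int sideLength) => SideLength = sideLength;
        public int SideLength { get; }
    }

    public sealed class Rectangle : Shape
    {
        public Rectangle(int width, int height)
        {
            Width = width;
            Height = height;
        }

        public int Width { get; }
        public int Height { get; }
    }

    public sealed class Circle : Shape
    {
        public Circle(int diameter) => Diameter = diameter;
        public int Diameter { get; }
    }
}

public static class ShapeDemo
{
    // Handles each of the three cases by checking the runtime type.
    public static string Describe(Shape shape)
    {
        if (shape is Shape.Square square)
            return "Square with side " + square.SideLength;

        if (shape is Shape.Rectangle rectangle)
            return "Rectangle " + rectangle.Width + "x" + rectangle.Height;

        if (shape is Shape.Circle circle)
            return "Circle with diameter " + circle.Diameter;

        throw new ArgumentException("Unknown shape", nameof(shape));
    }
}
```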

Maybe can be designed to be a sum type in the following way:

public abstract class Maybe<T>
{
    private Maybe()
    {
    }

    public sealed class Some : Maybe<T>
    {
        public Some(T value) => Value = value;
        public T Value { get; }
    }

    public sealed class None : Maybe<T>
    {
    }
}

Maybe<T> is a sum type. It has two subtypes: Some and None. This means that a variable of type Maybe<T> can only hold an instance of type Maybe<T>.Some or Maybe<T>.None.

Note that it cannot hold an instance of type Maybe<T> itself because Maybe<T> is abstract. Also, the private constructor means that no classes other than the ones nested inside Maybe<T> can inherit from it.

The Some subtype has a single property, the Value property. The None subtype has no properties because it models the case where the value is missing.

Now, after a caller calls the GetLogContents method and gets a Maybe<string>, it can do something like the following:

var contents = GetLogContents(1);

if (contents is Maybe<string>.Some some)
{
    Console.WriteLine(some.Value);
}
else
{
    Console.WriteLine("Log file not found");
}

Here, we use the pattern matching feature of C# 7 to check if the contents variable is of type Maybe<string>.Some. If so, we write the log contents to the console. Otherwise, we inform the user that the log file was not found.

Note that here, there is no way (or at least it is hard) to access the value without first making sure that there is actually a value.

Still, this does not look very elegant.

The need to write Maybe<string>.Some here is inconvenient. Having to define the some variable and then access its Value property is also inconvenient.

One thing we can do is create a TryGetValue method in Maybe:

public bool TryGetValue(out T value)
{
    if (this is Some some)
    {
        value = some.Value;
        return true;
    }

    value = default(T);
    return false;
}

This method returns true if there is a value, and false otherwise. Additionally, if there is a value, the value out parameter will get the contained value.

Here is how it can be used:

if (contents.TryGetValue(out var value))
{
    Console.WriteLine(value);
}
else
{
    Console.WriteLine("Log file not found");
}

This is better since we don’t have to type the full type of the variable (e.g. Maybe<string>.Some). Also, we don’t have to define a some variable.

However, there is a bigger issue here. Now, the value variable can be accessed anywhere, even in the else branch where it does not contain a valid value.

A Roslyn analyzer can be built to prevent access to the value variable in a location where TryGetValue is not known to have returned true.

Another option is to define a Match method for Maybe. I talked about Match methods in the Designing Data Objects in C# and F# article.

Here is what the consuming code would look like:

var contents = GetLogContents(1);

contents.Match(some: value => 
{
    Console.WriteLine(value);
},
none: () =>
{
    Console.WriteLine("Log file not found");
});

The source code of the Match method used above is defined here.

Similar to the first solution, the value lambda parameter can only be accessed when there is a value.

One potential issue with this solution is performance. Using lambdas to describe how to handle the two different cases (some and none) might allocate objects that will later need to be garbage collected.

In many cases, this is not an issue. Always measure when it comes to performance.

Note: there is another overload of Match that allows you to return something in each case instead of doing something in each case.
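To make the two overloads concrete, here is a minimal sketch of how Match could be written for the class-based Maybe. It is an assumption about the shape of such methods, not a copy of the code in the linked repository:

```csharp
using System;

public abstract class Maybe<T>
{
    private Maybe()
    {
    }

    public sealed class Some : Maybe<T>
    {
        public Some(T value) => Value = value;
        public T Value { get; }
    }

    public sealed class None : Maybe<T>
    {
    }

    // Runs exactly one of the two actions, depending on the case.
    public void Match(Action<T> some, Action none)
    {
        if (this is Some s)
            some(s.Value);
        else
            none();
    }

    // Returns the result of exactly one of the two functions.
    public TResult Match<TResult>(Func<T, TResult> some, Func<TResult> none)
    {
        if (this is Some s)
            return some(s.Value);

        return none();
    }
}

public static class MatchDemo
{
    // Uses the value-returning overload to build a message.
    public static string Describe(Maybe<string> contents)
    {
        return contents.Match(
            some: value => "Log: " + value,
            none: () => "Log file not found");
    }
}
```

In both overloads, the contained value is only reachable inside the some lambda.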

Defining Maybe as a struct

Note: The source code for this section can be found here: https://github.com/ymassad/MaybeExamples/tree/master/MaybeAsAStruct

Consider this method:

static Maybe<string> GetLogContents(int id)
{
    return null;
}

In the previous section, because we defined Maybe as a class, null is a valid value. That is, Maybe<string> can actually be one of three things: null, Maybe<string>.None and Maybe<string>.Some.

This method returns null and therefore the consuming code will most likely not behave as expected. For example, calling the Match method on null will throw a NullReferenceException.

Also, the following if statement:

if (contents is Maybe<string>.None)
{

}

..will evaluate to false because null is not equivalent to Maybe<string>.None.

Because of this, it makes sense to define Maybe as a struct.

Structs in C# cannot have the value null. For example, this code is invalid:

int a = null;

Therefore, if we define Maybe as a struct, we are guaranteed that it will never have the value null.

The struct version of Maybe can be found here. Here is some of the code:

public struct Maybe<T>
{
    private readonly T value;

    private readonly bool hasValue;

    private Maybe(T value)
    {
        this.value = value;
        hasValue = true;
    }
    //...
}

Structs always have a public parameterless constructor that initializes all fields to their default values. This means that if we construct Maybe<string> like this:

var maybe = new Maybe<string>();

..the value field will get the value of null (the default for string), and the hasValue field will get the value of false (the default of bool). This will indicate that this instance contains no value.

The constructor defined above always sets hasValue to true. It is private, so it can only be used from within the struct. It is used in the following member:

public static implicit operator Maybe<T>(T value)
{
    if(value == null)
        return new Maybe<T>();

    return new Maybe<T>(value);
}

This is the declaration of an implicit operator for converting T to Maybe<T>. This means that we can assign a string value to a variable of type Maybe<string>. It also means that we can return a string value inside a method that has Maybe<string> as the return type.

The code here checks the value for null. If it is null, it returns a Maybe instance that contains no value. Otherwise, it uses the defined constructor to return a Maybe instance that contains the value.

This operator is the reason why the GetLogContents method returns the result of calling the File.ReadAllText method directly (which is of type string) without constructing a Maybe<string>.

I also defined a static non-generic Maybe class that has some interesting members:

None: a static property whose value is implicitly convertible to Maybe<T> for any T. That is, there is an implicit conversion operator defined (see here) that converts it to a Maybe<T> that contains no value. See the GetLogContents method from before: the code in one of its branches returns Maybe.None.

Some: a static method that allows us to create an instance of Maybe<T> that contains a value. Unlike the implicit operator, this method throws an exception if the value is null.
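The following sketch shows one way these two members could fit together with the struct. The NoneType marker type and the HasValue property are hypothetical names introduced for illustration, and the linked source code may be organized differently:

```csharp
using System;

public struct Maybe<T>
{
    private readonly T value;
    private readonly bool hasValue;

    private Maybe(T value)
    {
        this.value = value;
        hasValue = true;
    }

    // Hypothetical helper so the demo can observe the state.
    public bool HasValue => hasValue;

    // T -> Maybe<T>; a null input becomes an empty Maybe.
    public static implicit operator Maybe<T>(T value)
    {
        if (value == null)
            return new Maybe<T>();

        return new Maybe<T>(value);
    }

    // Maybe.None -> empty Maybe<T>, for any T.
    public static implicit operator Maybe<T>(Maybe.NoneType none)
    {
        return new Maybe<T>();
    }
}

public static class Maybe
{
    // A marker type that carries no data (hypothetical name).
    public struct NoneType
    {
    }

    public static NoneType None => new NoneType();

    // Unlike the implicit operator, this rejects null.
    public static Maybe<T> Some<T>(T value)
    {
        if (value == null)
            throw new ArgumentNullException(nameof(value));

        return value; // uses the implicit T -> Maybe<T> conversion
    }
}
```

Because NoneType has no type parameter, the compiler can pick the target Maybe<T> from the method's return type or the variable's declared type.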

The Map method

Maybe<T> has a method called Map. Consider this example:

Maybe<string> str = "hello";
Maybe<int> length = str.Map(x => x.Length);

In this example, we have a Maybe<string> that we convert into a Maybe<int> via the Map method. Map allows us to convert the value inside the Maybe if there is a value. If there is no value, Map simply returns an empty Maybe of the new type. In the example above, we want to get the length of the string inside the Maybe.

The lambda given to the Map method will only be used if there is a value inside the Maybe.

I will show you another example of Map soon.
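A minimal sketch of how Map could be implemented on the struct version follows. It assumes the fields and the implicit operator shown earlier; the TryGetValue method and the MapDemo class are added only so the result can be inspected:

```csharp
using System;

public struct Maybe<T>
{
    private readonly T value;
    private readonly bool hasValue;

    private Maybe(T value)
    {
        this.value = value;
        hasValue = true;
    }

    public static implicit operator Maybe<T>(T value)
    {
        if (value == null)
            return new Maybe<T>();

        return new Maybe<T>(value);
    }

    public bool TryGetValue(out T result)
    {
        result = value;
        return hasValue;
    }

    // Applies convert only when a value is present.
    public Maybe<TResult> Map<TResult>(Func<T, TResult> convert)
    {
        if (!hasValue)
            return new Maybe<TResult>();

        // Implicit TResult -> Maybe<TResult>; a null result becomes empty.
        return convert(value);
    }
}

public static class MapDemo
{
    public static Maybe<int> LengthOf(Maybe<string> str)
    {
        return str.Map(x => x.Length);
    }
}
```

Calling MapDemo.LengthOf("hello") yields a Maybe<int> that contains 5, while an empty Maybe<string> yields an empty Maybe<int> without the lambda ever running.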

The Bind method

Consider this example:

static Maybe<int> FindErrorCode(string logContents)
{
    var logLines = logContents.Split(Environment.NewLine, StringSplitOptions.RemoveEmptyEntries);

    return
        logLines
            .FirstOrNone(x => x.StartsWith("Error code: "))
            .Map(x => x.Substring("Error code: ".Length))
            .Bind(x => x.TryParseToInt());
}

This method takes the contents of some log file and tries to find an error code inside it. Some line in the file is expected to contain something like this:

Error code: 981

Some log files might not contain an error and thus might not contain such a line.

First, the method splits the content into lines. Then, it tries to find a line that starts with "Error code: ".

The FirstOrNone method is modeled on the FirstOrDefault method in LINQ. I defined this method as an extension method over IEnumerable<T>. If there is at least one item in the enumerable, FirstOrNone will return a Maybe<T> that contains the first item. If the enumerable is empty, a Maybe<T> that has no value is returned. When a predicate is given, as in the example above, only the items that satisfy the predicate are considered.
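As an illustration, FirstOrNone could be sketched like this. The minimal Maybe<T> struct is repeated only to keep the example self-contained, and the EnumerableExtensions class name is an assumption:

```csharp
using System;
using System.Collections.Generic;

public struct Maybe<T>
{
    private readonly T value;
    private readonly bool hasValue;

    private Maybe(T value)
    {
        this.value = value;
        hasValue = true;
    }

    public static implicit operator Maybe<T>(T value)
    {
        if (value == null)
            return new Maybe<T>();

        return new Maybe<T>(value);
    }

    public bool TryGetValue(out T result)
    {
        result = value;
        return hasValue;
    }
}

public static class EnumerableExtensions
{
    // Returns the first item, or an empty Maybe for an empty sequence.
    public static Maybe<T> FirstOrNone<T>(this IEnumerable<T> source)
    {
        foreach (var item in source)
            return item;

        return new Maybe<T>();
    }

    // Returns the first item that satisfies the predicate, if any.
    public static Maybe<T> FirstOrNone<T>(
        this IEnumerable<T> source,
        Func<T, bool> predicate)
    {
        foreach (var item in source)
        {
            if (predicate(item))
                return item;
        }

        return new Maybe<T>();
    }
}
```

Unlike FirstOrDefault, the caller cannot confuse "no item found" with a default value such as 0 or null.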

The Map method is used to convert the value inside the Maybe<string> (if there is a value). Here we want to take a substring of the line. More specifically, we want to remove the "Error code: " part from the line.

Now comes the Bind method. Like Map, Bind is also about converting or transforming the value inside the Maybe.

There is a difference though. Let’s look at the signatures of both these methods:

public Maybe<TResult> Map<TResult>(Func<T, TResult> convert)

public Maybe<TResult> Bind<TResult>(Func<T, Maybe<TResult>> convert)

The difference is in the conversion function. When calling Map, we tell it how to convert T (the original value type) to TResult (the new value type). When calling Bind, the conversion function is expected to return Maybe<TResult>, not TResult.

Let’s look at the signature of the TryParseToInt method used in the example above:

public static Maybe<int> TryParseToInt(this string str)

This method is similar to int.TryParse. It takes a string and tries to parse it into a Maybe<int>. If the string can be parsed into an int, the returned Maybe<int> will contain the result. Otherwise, an empty Maybe<int> is returned.

If we had used Map instead of Bind in the FindErrorCode method above, the type returned would have been Maybe<Maybe<int>>.

This type is not really useful and is hard to work with. Bind simply flattens Maybe<Maybe<TResult>> into Maybe<TResult>. This is why Bind is sometimes called FlatMap.
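The difference can be seen in a small sketch. The bodies of Map, Bind and TryParseToInt below are assumptions that match the signatures shown above; they are not copied from the article's source code, and the BindDemo class is hypothetical:

```csharp
using System;

public struct Maybe<T>
{
    private readonly T value;
    private readonly bool hasValue;

    private Maybe(T value)
    {
        this.value = value;
        hasValue = true;
    }

    public static implicit operator Maybe<T>(T value)
    {
        if (value == null)
            return new Maybe<T>();

        return new Maybe<T>(value);
    }

    public bool TryGetValue(out T result)
    {
        result = value;
        return hasValue;
    }

    public Maybe<TResult> Map<TResult>(Func<T, TResult> convert)
    {
        if (!hasValue)
            return new Maybe<TResult>();

        return convert(value); // wrapped by the implicit operator
    }

    public Maybe<TResult> Bind<TResult>(Func<T, Maybe<TResult>> convert)
    {
        if (!hasValue)
            return new Maybe<TResult>();

        return convert(value); // already a Maybe, so nothing to wrap
    }
}

public static class StringExtensions
{
    public static Maybe<int> TryParseToInt(this string str)
    {
        if (int.TryParse(str, out var result))
            return result;

        return new Maybe<int>();
    }
}

public static class BindDemo
{
    // Bind keeps a single layer of Maybe.
    public static Maybe<int> ParseWithBind(Maybe<string> text)
    {
        return text.Bind(x => x.TryParseToInt());
    }

    // Map adds a second layer around the result of TryParseToInt.
    public static Maybe<Maybe<int>> ParseWithMap(Maybe<string> text)
    {
        return text.Map(x => x.TryParseToInt());
    }
}
```

With ParseWithMap, a caller has to unwrap two layers to reach the int, which is exactly the inconvenience Bind removes.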

Why is Maybe called a Monad?

A Monad is a container type C<T> that defines two functions:

Return: a function that takes a value of type T and gives us a C<T> where C is the type of the container. For example, we can convert 1 into a Maybe<int> by using the Maybe.Some method:

var maybe = Maybe.Some(1);

Bind: a function that takes a C<T> and a function from T to C<TResult> and returns a C<TResult>.

That is, Bind looks like this:

(C<T>, T => C<TResult>) => C<TResult>

In the implementation in the source code, Bind is an instance method. So basically, the C<T> it takes is the instance itself.

There are some rules that a Monad has to follow regarding the Bind function. I am not talking about these rules here because I want to keep this article practical and not theoretical.

IEnumerable<T> is also a Monad.

The Return function for IEnumerable<T> is simply the creation of an array that contains a single item. We can also define a Return method like this:

public static IEnumerable<T> Return<T>(T item)
{
    yield return item;
}

The Bind function for IEnumerable<T> is SelectMany. Consider its signature here (I changed TSource to T to make it easy to read):

public static IEnumerable<TResult> SelectMany<T, TResult>(
    this IEnumerable<T> source,
    Func<T, IEnumerable<TResult>> selector)

It takes an IEnumerable<T> (C<T>) and a function from T to IEnumerable<TResult> (from T to C<TResult>) and returns an IEnumerable<TResult> (C<TResult>).
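A short sketch of SelectMany acting as Bind is shown below. Words and AllWords are hypothetical helpers chosen for illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumerableMonadDemo
{
    // Return: wraps a single item in an IEnumerable<T>.
    public static IEnumerable<T> Return<T>(T item)
    {
        yield return item;
    }

    // A function from T to C<TResult>: one line produces many words.
    public static IEnumerable<string> Words(string line)
    {
        return line.Split(' ');
    }

    // Bind: SelectMany flattens the per-line sequences into one sequence.
    public static List<string> AllWords(IEnumerable<string> lines)
    {
        return lines.SelectMany(line => Words(line)).ToList();
    }
}
```

Using Select instead of SelectMany here would produce an IEnumerable<IEnumerable<string>>, the sequence counterpart of Maybe<Maybe<int>>.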

Using LINQ to work with Maybe (and other Monads)

Consider the following method from the source code:

static void Test4()
{
    var errorDescriptionMaybe =
        GetLogContents(13)
            .Bind(contents => FindErrorCode(contents))
            .Bind(errorCode => GetErrorDescription(errorCode));
} 

I have talked about GetLogContents and FindErrorCode earlier. GetErrorDescription takes an int representing the error code and returns Maybe<string> representing the error description. This method might return an empty Maybe if no error description can be found for the specified error. Here is the definition of this method:

static Maybe<string> GetErrorDescription(int errorCode)
{
    var filename = "c:\\errorCodes\\" + errorCode + ".txt";

    if (File.Exists(filename))
        return File.ReadAllText(filename);

    return Maybe.None;
} 

What the Test4 method does is that it gets the log contents (if any), finds the error code inside the log contents (if any), and finally gets a description of the error code (if any).

Currently, GetErrorDescription only requires access to errorCode because it uses the file system to find the description of the error based on what is stored in the files.

Consider this overload of GetErrorDescription:

static Maybe<string> GetErrorDescription(int errorCode, string logContents)
{
    var logLines = logContents.Split(Environment.NewLine, StringSplitOptions.RemoveEmptyEntries);

    var linePrefix = "Error description for code " + errorCode + ": ";

    return
        logLines
            .FirstOrNone(x => x.StartsWith(linePrefix))
            .Map(x => x.Substring(linePrefix.Length));
}

This method expects to find the error description in the log contents in a special line. For example, a line might contain the following:

Error description for code 534: Database is down!

This GetErrorDescription overload requires the log contents as a parameter. We can pass contents to this method in the following way:

static void Test5()
{
    var errorDescriptionMaybe =
        GetLogContents(13)
            .Bind(contents => FindErrorCode(contents)
                .Bind(errorCode => GetErrorDescription(errorCode, contents)));
}

Notice how the second call to Bind is now nested. I did this to be able to access the contents lambda parameter.

Test4 was not great when it comes to readability, and Test5 is a bit less readable still. Imagine having 10 operations instead of just 3; the nesting would be even harder to follow.

Consider this now:

static void Test6()
{
    var errorDescriptionMaybe =
        from contents in GetLogContents(13)
        from errorCode in FindErrorCode(contents)
        from errorDescription in GetErrorDescription(errorCode, contents)
        select errorDescription;
}

Here in Test6, I am using LINQ query syntax to do the same thing as Test5.

Is it more readable now?

To make Maybe work with LINQ, I had to define some Select and SelectMany methods inside Maybe. With this, Select works exactly like Map. SelectMany is similar to Bind but has an extra parameter:

public Maybe<TResult> SelectMany<T2, TResult>(
    Func<T, Maybe<T2>> convert,
    Func<T, T2, TResult> finalSelect)

If the first Maybe value contains a value (T), and the result of calling convert (Maybe<T2>) contains a value (T2), then the finalSelect function is called to compute something from T and T2.
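The behavior described above could be implemented roughly as follows. This is a sketch that assumes the struct fields shown earlier; the actual methods in the source code may differ, and the LinqDemo class is hypothetical:

```csharp
using System;

public struct Maybe<T>
{
    private readonly T value;
    private readonly bool hasValue;

    private Maybe(T value)
    {
        this.value = value;
        hasValue = true;
    }

    public static implicit operator Maybe<T>(T value)
    {
        if (value == null)
            return new Maybe<T>();

        return new Maybe<T>(value);
    }

    public bool TryGetValue(out T result)
    {
        result = value;
        return hasValue;
    }

    // Same behavior as Map, under the name that query syntax looks for.
    public Maybe<TResult> Select<TResult>(Func<T, TResult> convert)
    {
        if (!hasValue)
            return new Maybe<TResult>();

        return convert(value);
    }

    public Maybe<TResult> SelectMany<T2, TResult>(
        Func<T, Maybe<T2>> convert,
        Func<T, T2, TResult> finalSelect)
    {
        // No first value: nothing to convert.
        if (!hasValue)
            return new Maybe<TResult>();

        // No second value: nothing to combine.
        if (!convert(value).TryGetValue(out var second))
            return new Maybe<TResult>();

        return finalSelect(value, second);
    }
}

public static class LinqDemo
{
    // The compiler translates the two from clauses into one SelectMany call.
    public static Maybe<string> Describe(Maybe<int> age, Maybe<string> name)
    {
        return
            from a in age
            from n in name
            select n + " is " + a;
    }
}
```

The query compiles because the compiler only looks for methods with the right names and shapes; Maybe<T> does not need to implement any LINQ interface.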

Consider this example:

var customer = from age in Maybe.Some(30)
             from name in Maybe.Some("Adam")
             select new Customer(name, age);

This is translated to:

Maybe<Customer> customer =
    Maybe.Some(30)
        .SelectMany(
            convert: (int age) => Maybe.Some("Adam"),
            finalSelect: (int age, string name) => new Customer(name, age));

In select new Customer(name, age), we need access to both name and age and this is what the finalSelect function gives us. It also allows us to produce a value of a type that is different from the types of the two involved Maybe objects.

I also added a Where method that allows us to use where in a query syntax:

static void Test7()
{
    var errorDescriptionMaybe =
        from contents in GetLogContents(13)
        from errorCode in FindErrorCode(contents)
        where errorCode < 1000
        from errorDescription in GetErrorDescription(errorCode, contents)
        select errorDescription;
}

Where returns an empty Maybe if the value does not meet the condition.
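A Where method with this behavior could look like the sketch below. The struct is again a minimal stand-in for the one in the source code, and the WhereDemo class is hypothetical:

```csharp
using System;

public struct Maybe<T>
{
    private readonly T value;
    private readonly bool hasValue;

    private Maybe(T value)
    {
        this.value = value;
        hasValue = true;
    }

    public static implicit operator Maybe<T>(T value)
    {
        if (value == null)
            return new Maybe<T>();

        return new Maybe<T>(value);
    }

    public bool TryGetValue(out T result)
    {
        result = value;
        return hasValue;
    }

    public Maybe<TResult> Select<TResult>(Func<T, TResult> convert)
    {
        if (!hasValue)
            return new Maybe<TResult>();

        return convert(value);
    }

    // Keeps the value only if it is present and meets the condition.
    public Maybe<T> Where(Func<T, bool> predicate)
    {
        if (hasValue && predicate(value))
            return this;

        return new Maybe<T>();
    }
}

public static class WhereDemo
{
    public static Maybe<int> OnlySmallCodes(Maybe<int> errorCode)
    {
        return
            from code in errorCode
            where code < 1000
            select code;
    }
}
```

The predicate is never called on an empty Maybe, so it can safely assume that a value exists.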

LINQ query syntax was designed to be extensible in the way we just saw. I might talk about this in detail in an upcoming article.

Conclusion:

In this article, I talked about Maybe, a container that may or may not contain a value.

The most important thing Maybe does is that it allows us to express when something is optional.

I showed two implementations of Maybe: one that uses a class and one that uses a struct. I think using a struct is better because a struct Maybe models exactly two states (has a value and has no value), while a class Maybe allows a third state, which is null.

I talked about the Map and Bind methods and how they allow us to convert/transform the value inside Maybe.

I also talked very briefly about what it means to be a Monad and gave an example of IEnumerable<T> as a Monad.

Finally, I explained how we can use LINQ query syntax to work with Maybe in a more readable way.

Download the entire source code of this article from Github.

This article has been technically reviewed by Damir Arh.

This article has been editorially reviewed by Suprotim Agarwal.

Author
Yacoub Massad is a software developer that works mainly with Microsoft technologies. Currently, he works at Zeva International where he uses C#, .NET, and other technologies to create eDiscovery solutions. He is interested in learning and writing about software design principles that aim at creating maintainable software. You can view his blog posts at criticalsoftwareblog.com