
Concurrent Programming in .NET Core

Posted by: Damir Arh , on 4/13/2017, in Category C#
Views: 28101
Abstract: Learn approaches to concurrent programming in .NET Core, as well as potential issues to be aware of.

Every computer that we buy today has a CPU with more than one core, allowing it to execute multiple instructions in parallel. Operating systems take advantage of this configuration by scheduling processes to different cores.

However, concurrency can also help us improve performance of individual applications with asynchronous I/O operations and parallel processing.

In .NET Core, tasks are the main abstraction for concurrent programming, but there are other supporting classes that can make our job easier.

Concurrent Programming - Asynchronous vs. Multithreaded Code

Concurrent programming is a broad term, so we should start by examining the difference between asynchronous methods and actual multithreading.

Although .NET Core uses Task to represent both concepts, there is a core difference in how it handles them internally.

Asynchronous methods run in the background while the calling thread is doing other work. Such methods are typically I/O bound, i.e. they spend most of their time in input and output operations, such as file or network access.

Whenever possible, it makes a lot of sense to use asynchronous I/O methods in favor of synchronous ones. In the meantime, the calling thread can handle user interaction in a desktop application or process other requests in a server application, instead of just idly waiting for the operation to complete.

You can read more about calling asynchronous methods using async and await in my Asynchronous Programming in C# using Async Await – Best Practices article for the September edition of the DNC Magazine.

CPU-bound methods require CPU cycles to do their work and can only run in the background using their own dedicated thread. The number of available CPU cores limits the number of threads that can run in parallel. The operating system is responsible for switching between the threads, giving each of them a chance to execute its code.

These methods still run concurrently, but not necessarily in parallel. This means that even when two methods do not execute at the same instant, one method can still start executing in the middle of the other, which is paused during that time.


Figure 1: Parallel versus concurrent execution

This article will focus on multithreaded concurrent programming in .NET Core as described in the last paragraph.

Task Parallel Library

.NET Framework 4 introduced the Task Parallel Library (TPL) as the preferred set of APIs for writing concurrent code. The same programming model is adopted by .NET Core.

To run a piece of code in the background, you need to wrap it into a task:

var backgroundTask = Task.Run(() => DoComplexCalculation(42));
// do other work
var result = backgroundTask.Result;

The Task.Run method accepts a Func&lt;TResult&gt; if the code needs to return a result, or an Action if it does not return any result. Of course, in both cases you can use a lambda, just as I did in the example above to invoke the long-running method with a parameter.
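As a quick sketch of both overloads (DoComplexCalculation is the method from the example above; LogProgress is a hypothetical void method standing in for any fire-and-forget work):

```csharp
// Func<TResult> overload: the task carries a result of type int
Task<int> calculation = Task.Run(() => DoComplexCalculation(42));

// Action overload: the task completes without producing a result
Task logging = Task.Run(() => LogProgress("started"));
```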

A thread from the thread pool will process the task. The .NET Core runtime includes a default scheduler that takes care of queuing and executing the tasks using the thread pool threads. You can implement your own scheduling algorithm by deriving from the TaskScheduler class and using it instead of the default scheduler, but this discussion is beyond the scope of this article.

In the sample we just saw, I accessed the Result property to merge the background thread back into the calling thread. For tasks that do not return a result, I could call Wait() instead. Both will block the calling method until the background task completes.

To avoid blocking the calling thread (e.g. in an ASP.NET Core application), you can use the await keyword instead:

var backgroundTask = Task.Run(() => DoComplexCalculation(42));
// do other work
var result = await backgroundTask;

By doing so, the calling thread will be released to process other incoming requests. Once the task completes, an available worker thread will resume processing the request. Of course, the controller action method must be asynchronous for this to work:

public async Task<IActionResult> Index()
{
    // method body
}

Handling Exceptions

Any exceptions thrown by the task will propagate to the calling thread at the point of merging the two threads back together:

  • If you use Result or Wait(), they will be wrapped into an AggregateException. The actual exception thrown will be stored in its InnerException property.
  • If you use await, the original exception will remain unwrapped.

In both cases, the call stack information will remain intact.
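The two behaviors can be sketched side by side; this is a minimal example, assuming it runs inside an async method:

```csharp
var failing = Task.Run(() => throw new InvalidOperationException("oops"));

try
{
    failing.Wait(); // same behavior applies to reading Result on a Task<T>
}
catch (AggregateException e)
{
    // the original exception is wrapped
    Console.WriteLine(e.InnerException.Message); // "oops"
}

try
{
    await failing;
}
catch (InvalidOperationException e)
{
    // await rethrows the original, unwrapped exception
    Console.WriteLine(e.Message); // "oops"
}
```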

Cancelling a Task

Since tasks can be long running, you might want to have an option for cancelling them prematurely. To allow this option, pass a cancellation token when creating the task. You can use it afterwards to trigger the cancellation:

var tokenSource = new CancellationTokenSource();
var cancellableTask = Task.Run(() =>
{
    for (int i = 0; i < 100; i++)
    {
        if (tokenSource.Token.IsCancellationRequested)
        {
            // clean up before exiting
            tokenSource.Token.ThrowIfCancellationRequested();
        }
        // do long-running processing
    }
    return 42;
}, tokenSource.Token);
// cancel the task
tokenSource.Cancel();
try
{
    await cancellableTask;
}
catch (OperationCanceledException e)
{
    // handle the exception
}

To actually stop the task early, you need to check the cancellation token in the task and react if a cancellation was requested: do any clean up you might need to do and then call ThrowIfCancellationRequested() to exit the task. This will throw an OperationCanceledException, which can then be handled accordingly in the calling thread.

Coordinating Multiple Tasks

If you need to run more than one background task, there are methods available to help you coordinate them.

To run multiple tasks concurrently, just start them consecutively and collect references to them, e.g. in an array:

var backgroundTasks = new []
{
    Task.Run(() => DoComplexCalculation(1)),
    Task.Run(() => DoComplexCalculation(2)),
    Task.Run(() => DoComplexCalculation(3))
};

Now you can use static helper methods of the Task class to wait for their execution to complete synchronously or asynchronously:

// wait synchronously
Task.WaitAny(backgroundTasks);
Task.WaitAll(backgroundTasks);
// wait asynchronously
await Task.WhenAny(backgroundTasks);
await Task.WhenAll(backgroundTasks);

The two methods at the bottom actually return a task themselves, which you can once again manipulate like any other task. To get the task results, you can inspect the Result property of the original tasks.
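For example, assuming the tasks in the array above are Task&lt;int&gt; instances (i.e. DoComplexCalculation returns an int), you can collect all results in either of these ways:

```csharp
// awaiting Task.WhenAll on Task<int> instances yields all results at once
int[] results = await Task.WhenAll(backgroundTasks);

// alternatively, read each completed task's Result individually
foreach (var task in backgroundTasks)
{
    Console.WriteLine(task.Result);
}
```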

Handling exceptions when working with multiple tasks is a bit trickier. The WaitAll and WhenAll methods will throw an exception whenever any of the tasks in the collection has thrown one. However, while WaitAll’s AggregateException will contain all the thrown exceptions in its InnerExceptions property, WhenAll will only rethrow the first exception thrown by any of the tasks. To determine which task has thrown which exception, you will need to check the Status and Exception properties of each individual task.

You need to be even more careful when using WaitAny and WhenAny. Both of them wait for the first task to complete (successfully or not), but do not throw an exception even if the task has thrown one. They only return the index of the completed task or the completed task itself, respectively. You will need to catch the exception when awaiting the completed task or when accessing its result, e.g.:

var completedTask = await Task.WhenAny(backgroundTasks);
try
{
    var result = await completedTask;
}
catch (Exception e)
{
    // handle exception
}

If you want to run multiple tasks consecutively instead of in parallel, you can use continuations:

var compositeTask = Task.Run(() => DoComplexCalculation(42))
    .ContinueWith(previous => DoAnotherComplexCalculation(previous.Result),
        TaskContinuationOptions.OnlyOnRanToCompletion);

The ContinueWith() method allows you to chain multiple tasks to be executed one after another. The continuation gets a reference to the previous task, so it can use its result or check its status. You can also add a condition controlling when to run the continuation, e.g. only when the previous task completed successfully or only when it threw an exception. This adds flexibility in comparison to consecutively awaiting multiple tasks.

Of course, you can combine continuations with all the previously discussed features: exception handling, cancellation and running tasks in parallel. This gives you a lot of expressive power to combine the tasks in different ways:

var multipleTasks = new[]
{
    Task.Run(() => DoComplexCalculation(1)),
    Task.Run(() => DoComplexCalculation(2)),
    Task.Run(() => DoComplexCalculation(3))
};
var combinedTask = Task.WhenAll(multipleTasks);

var successfulContinuation = combinedTask.ContinueWith(task =>
        CombineResults(task.Result), TaskContinuationOptions.OnlyOnRanToCompletion);
var failedContinuation = combinedTask.ContinueWith(task =>
        HandleError(task.Exception), TaskContinuationOptions.NotOnRanToCompletion);

await Task.WhenAny(successfulContinuation, failedContinuation);

Task Synchronization

If tasks are completely independent, the methods we just saw for coordinating them will suffice. However, as soon as they need to access shared data concurrently, additional synchronization is required in order to prevent data corruption.

Whenever two or more threads attempt to modify a data structure in parallel, data can quickly become inconsistent. The following snippet of code is one such example:

var counters = new Dictionary<int, int>();

if (counters.ContainsKey(key))
{
    counters[key]++;
}
else
{
    counters[key] = 1;
}

When multiple threads execute the above code in parallel, a specific execution order of instructions in different threads can cause the data to be incorrect, e.g.:

  • Both threads check the condition for the same key value when it is not yet present in the collection.
  • As a result, they both enter the else block and set the value for this key to 1.
  • The final counter value will be 1 instead of 2, which would be the expected result if the threads executed the same code consecutively.

Such blocks of code, which may only be entered by one thread at a time, are called critical sections. In C#, you can protect them by using the lock statement:

var counters = new Dictionary<int, int>();

lock (syncObject)
{
    if (counters.ContainsKey(key))
    {
        counters[key]++;
    }
    else
    {
        counters[key] = 1;
    }
}

For this approach to work, all threads must share the same syncObject as well. As a best practice, syncObject should be a private Object instance that is exclusively used for protecting access to a single critical section and cannot be accessed from outside.

The lock statement will allow only one thread to access the block of code inside it. It will block the next thread trying to access it until the previous one exits it. This will ensure that a thread will execute the complete critical section of code without interruptions by another thread. Of course, this will reduce the parallelism and slow down the overall execution of code, therefore you will want to minimize the number of critical sections and to make them as short as possible.

The lock statement is just a shorthand for using the Monitor class:

var lockWasTaken = false;
var temp = syncObject;
try
{
    Monitor.Enter(temp, ref lockWasTaken);
    // lock statement body
}
finally
{
    if (lockWasTaken)
    {
        Monitor.Exit(temp);
    }
}

Although most of the time you will want to use the lock statement, the Monitor class can give you additional control when you need it. For example, you can use TryEnter() instead of Enter() and specify a timeout to avoid waiting indefinitely for the lock to be released.
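A minimal sketch of the timeout variant, reusing the syncObject field from the earlier examples:

```csharp
bool lockTaken = false;
try
{
    // give up if the lock cannot be acquired within one second
    Monitor.TryEnter(syncObject, TimeSpan.FromSeconds(1), ref lockTaken);
    if (lockTaken)
    {
        // critical section
    }
    else
    {
        // lock not acquired in time: fall back, e.g. report a timeout
    }
}
finally
{
    if (lockTaken)
    {
        Monitor.Exit(syncObject);
    }
}
```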

Other Synchronization Primitives

Monitor is just one of the many synchronization primitives in .NET Core. Depending on the scenario, others might be more suitable.

Mutex is a more heavyweight version of Monitor that relies on the underlying operating system. This allows it to synchronize access to a resource not only across thread boundaries, but even across process boundaries. For synchronization within a single process, Monitor is the recommended alternative.
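Cross-process synchronization works by giving the mutex a name that all processes agree on; the name below is a hypothetical example:

```csharp
// a named mutex is visible to other processes on the same machine
using (var mutex = new Mutex(false, @"Global\MyAppMutex"))
{
    // wait up to five seconds for ownership instead of blocking forever
    if (mutex.WaitOne(TimeSpan.FromSeconds(5)))
    {
        try
        {
            // access the resource shared between processes
        }
        finally
        {
            mutex.ReleaseMutex();
        }
    }
}
```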

SemaphoreSlim and Semaphore can limit the number of concurrent consumers of a resource to a configurable maximum number, instead of to only a single one, as Monitor does. SemaphoreSlim is more lightweight than Semaphore, but restricted to only a single process. Whenever possible you should use SemaphoreSlim instead of Semaphore.
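As a sketch, SemaphoreSlim can throttle how many tasks enter a section at once; the method name below is hypothetical:

```csharp
// allow at most 3 concurrent callers into the protected section
var semaphore = new SemaphoreSlim(3);

async Task UseLimitedResourceAsync()
{
    await semaphore.WaitAsync(); // asynchronous wait: no thread is blocked
    try
    {
        // at most 3 tasks execute this block at the same time
    }
    finally
    {
        semaphore.Release();
    }
}
```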

ReaderWriterLockSlim can differentiate between two types of access to a resource. It allows an unlimited number of readers to access the resource in parallel, and limits writers to a single access at a time. It is great for protecting resources that are thread safe for reading, but require exclusive access for modifying data.
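A minimal sketch of protecting a cache this way (the cache itself and the method names are hypothetical):

```csharp
var cacheLock = new ReaderWriterLockSlim();
var cache = new Dictionary<int, string>();

string Read(int key)
{
    cacheLock.EnterReadLock(); // many readers may hold this at once
    try
    {
        return cache.TryGetValue(key, out var value) ? value : null;
    }
    finally
    {
        cacheLock.ExitReadLock();
    }
}

void Write(int key, string value)
{
    cacheLock.EnterWriteLock(); // writers get exclusive access
    try
    {
        cache[key] = value;
    }
    finally
    {
        cacheLock.ExitWriteLock();
    }
}
```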

AutoResetEvent, ManualResetEvent and ManualResetEventSlim will block incoming threads, until they receive a signal (i.e. a call to Set()). Then the waiting threads will continue their execution. AutoResetEvent will only allow one thread to continue, before blocking again until the next call to Set(). ManualResetEvent and ManualResetEventSlim will not start blocking threads again, until Reset() is called. ManualResetEventSlim is the recommended more lightweight version of the two.
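The signaling pattern can be sketched with ManualResetEventSlim:

```csharp
// the event starts in the non-signaled state
var ready = new ManualResetEventSlim(false);

var worker = Task.Run(() =>
{
    ready.Wait(); // blocks until the event is signaled
    // continue processing once the signal arrives
});

// prepare shared state, then release all waiting threads at once
ready.Set();
```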

Interlocked provides a selection of atomic operations that are a better alternative to locking and other synchronization primitives, when applicable:

// non-atomic operation with a lock
lock (syncObject)
{
    counter++;
}
// equivalent atomic operation that doesn't require a lock
Interlocked.Increment(ref counter);

Concurrent Collections

When a critical section is required only to ensure atomic access to a data structure, a specialized data structure for concurrent access might be a better and more performant alternative. For example, by using ConcurrentDictionary instead of Dictionary, the lock statement example can be simplified:

var counters = new ConcurrentDictionary<int, int>();

counters.TryAdd(key, 0);
lock (syncObject)
{
    counters[key]++;
}

Naively, one might even want to use the following:

counters.AddOrUpdate(key, 1, (oldKey, oldValue) => oldValue + 1);

However, the update delegate in the above method is executed outside of the dictionary’s internal lock. If another thread modifies the value in the meantime, the dictionary detects the conflict and invokes the delegate again with the new value. This means the delegate may run more than once for a single call and must therefore be a pure function without side effects. Even concurrent collections are not immune to multithreading issues when used incorrectly.

Immutable collections are another alternative to concurrent collections.

Similar to concurrent collections they are also thread safe, but the underlying implementation is different. Any operations that change the data structures do not modify the original instance. Instead, they return a changed copy and leave the original instance unchanged:

var original = new Dictionary<int, int>().ToImmutableDictionary();
var modified = original.Add(key, value);

Because of this, any changes to the collection in one thread are not visible to the other threads, as they still reference the original unmodified collection, which is the very reason why immutable collections are inherently thread safe.

Of course, this makes them useful for a different set of problems. They work best in cases when multiple threads require the same input collection and then modify it independently, potentially with a final common step that merges the changes from all the threads. With regular collections, this would require creating a copy of the collection for each thread in advance.
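A minimal sketch of that pattern, assuming the System.Collections.Immutable package is referenced and the code runs inside an async method:

```csharp
using System.Collections.Immutable;

var original = ImmutableList.Create(1, 2, 3);

// each task derives its own modified copy; original is never touched
var task1 = Task.Run(() => original.Add(4));
var task2 = Task.Run(() => original.Add(5));

var copies = await Task.WhenAll(task1, task2);
// original still holds [1, 2, 3]; each copy carries its own added element
```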

Parallel LINQ (PLINQ)

Parallel LINQ (PLINQ) is an alternative to the Task Parallel Library. As the name suggests, it heavily relies on the LINQ (Language Integrated Query) feature, LINQ to Objects to be exact. As such, it is useful in scenarios where the same expensive operation needs to be performed on a large collection of values. Unlike ordinary LINQ to Objects, which performs all the operations in sequence, PLINQ can execute these operations in parallel on multiple CPU cores.

To take advantage of that, the code changes are minimal:

// sequential execution
var sequential = Enumerable.Range(0, 40)
    .Select(n => ExpensiveOperation(n))
    .ToArray();

// parallel execution
var parallel = Enumerable.Range(0, 40)
    .AsParallel()
    .Select(n => ExpensiveOperation(n))
    .ToArray();

As you can see, the only difference between the two snippets of code is a call to AsParallel(). This converts an IEnumerable to ParallelQuery, causing the rest of the query to be run in parallel. To switch back to sequential execution in the middle of the query, you can call AsSequential(), which will return an IEnumerable again.
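A short sketch of mixing the two modes in a single query, reusing ExpensiveOperation from the snippets above:

```csharp
var results = Enumerable.Range(0, 40)
    .AsParallel()                       // run the expensive step in parallel
    .Select(n => ExpensiveOperation(n))
    .AsSequential()                     // back to sequential for the rest
    .Where(n => n % 2 == 0)
    .ToArray();
```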

By default, PLINQ does not preserve the order of the items in the collection to make the process more efficient. However, you can change that by calling AsOrdered(), when the order is important:

var parallel = Enumerable.Range(0, 40)
    .AsParallel()
    .AsOrdered()
    .Select(n => ExpensiveOperation(n))
    .ToArray();

Again, you can switch back by calling its counterpart: AsUnordered().

Concurrent Programming in Full .NET Framework

Since .NET Core is a stripped-down reimplementation of the full .NET Framework, all the described approaches to concurrent programming in .NET Core are also available in the .NET Framework. The only exception is immutable collections, which are not an integral part of the full .NET Framework. They are distributed as a separate NuGet package, System.Collections.Immutable, which you need to install in the project to make them available.

Conclusion:

Whenever your application contains CPU intensive code which can run in parallel, it makes sense to take advantage of concurrent programming to improve performance and use the hardware more efficiently.

The APIs in .NET Core abstract away many details and make writing concurrent code much easier. Still, there are potential issues to be aware of; most of them related to accessing shared data from multiple threads.

If you can, you should avoid such situations altogether. When you cannot, make sure to select a synchronization method or data structure that is the most appropriate for your case.

Author
Damir Arh has many years of experience with Microsoft development tools; both in complex enterprise software projects and modern cross-platform mobile applications. In his drive towards better development processes, he is a proponent of test driven development, continuous integration and continuous deployment. He shares his knowledge by speaking at local user groups and conferences, blogging, and answering questions on Stack Overflow. He is an awarded Microsoft MVP for .NET since 2012.





