There’s no unanimous opinion on what a cloud application really is. For the purpose of this article, I’m going to use the term to describe applications that are developed with the intention of being hosted in the cloud.
Another term is often used for such applications: cloud-native applications.
Cloud-native applications
The official definition of cloud-native applications comes from the Cloud Native Computing Foundation (CNCF):
Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach.
These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.
It’s important to note that the definition not only describes the internal architecture of cloud-native applications (microservices, declarative APIs), but also the way these applications are deployed (containers, service meshes, immutable infrastructure) and maintained (robust automation, frequent high-impact changes).
In this article, I’m going to focus on the internal application architecture of cloud-native applications.
For a more high-level overview of microservices and their deployment, you can read the Microservices Architecture Pattern article by Subodh Sohoni.
The architecture of cloud-native applications or microservices has many similarities to the architecture of web applications, which I wrote about in my previous article from this series: Architecture of Web Applications.
There are two main differences:
- The web applications described in my previous article have their own user interface. The interfaces of cloud-native applications are predominantly APIs (typically REST-based).
- Cloud applications are implemented as a collection of services, each of them running as a separate process and usually also in its own isolated environment. In contrast, monolithic web applications run as a single process (unless multiple instances are deployed for load-balancing purposes).
The scope and size of these individual services vary across projects, but it is not uncommon for them to be as small as possible, limited to a well-defined, bounded subset of the full application functionality. That’s why the term microservices is often used for them.
The internal architecture of a single microservice is very similar to the internal architecture of a monolithic web application. They both include all parts of a typical multilayered application:
- Presentation layer (web page or API)
- Business logic layer
- Data access layer
Figure 1: Layers in monolithic and microservices architecture
It’s important that each microservice has not only its own data access code but also its own separate data storage (relational database or other). This gives it complete control over its data, but it also increases the need for communication between individual microservices.
From an architectural point of view, these properties are more important than the size of the microservice itself. That’s why I’ll simply use the term service for the rest of the article.
Communication between services
When one service needs access to data that’s in the domain of another service, it can’t simply read it from the common data store (because there is none), nor does it have direct access to the data store of another service. The only way to get to that data is through communication between the two services.
Synchronous communication
The most intuitive way for two services to communicate is probably the request-response pattern. In this case, the communication between the services consists of two messages:
- The request sent by the calling service to the called service.
- The response sent back by the called service to the calling service.
The interaction is fully synchronous: the calling service waits for a response from the called service before it can continue its processing.
Figure 2: Request response pattern
The most common implementation of the request-response pattern is the HTTP protocol: the client (usually the browser) sends a request to the server (usually the web server) and waits for the response before it can render it.
RESTful services use the same protocol: the client (usually an application) sends an HTTP request to the server (usually a web service). When services expose their functionality as a (RESTful) API they can also use the same approach for communicating between each other: one service acts as the client and another service acts as the server.
In the .NET ecosystem, the recommended stack for implementing RESTful services is ASP.NET Core Web API.
The framework is very similar to ASP.NET Core MVC (and not just by its name). Individual endpoints are implemented as action methods in controllers which group multiple action methods with the same base URL (and usually also related functionality):
[ApiController]
[Route("[controller]")]
public class WeatherForecastController : ControllerBase
{
    private static readonly string[] Summaries = new[]
    {
        "Freezing", "Bracing", "Chilly", "Cool", "Mild", "Warm", "Balmy", "Hot", "Sweltering", "Scorching"
    };

    [HttpGet]
    public IEnumerable<WeatherForecast> Get()
    {
        var rng = new Random();
        return Enumerable.Range(1, 5).Select(index => new WeatherForecast
        {
            Date = DateTime.Now.AddDays(index),
            TemperatureC = rng.Next(-20, 55),
            Summary = Summaries[rng.Next(Summaries.Length)]
        })
        .ToArray();
    }
}
The main difference is that there is no view in RESTful services. The response is a DTO (data transfer object) that’s automatically serialized as JSON (JavaScript Object Notation), and deserialized back again on the client. If needed, the serialization and deserialization process can be customized through serialization options.
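As a minimal sketch, such customization could look like this in Startup.ConfigureServices (the enum converter is just an illustrative option):

public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers()
        .AddJsonOptions(options =>
        {
            // e.g. serialize enum values as their names instead of numbers
            // (requires using System.Text.Json.Serialization;)
            options.JsonSerializerOptions.Converters.Add(new JsonStringEnumConverter());
        });
}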
The available endpoints and the structure of requests and responses for each one of them can be described using the OpenAPI specification. Swashbuckle is the most common library that’s used with ASP.NET Core Web API to generate OpenAPI descriptions of RESTful APIs.
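A minimal sketch of adding Swashbuckle to a Web API project (assuming the Swashbuckle.AspNetCore NuGet package is installed; the document title is just an example):

public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers();
    // registers the OpenAPI document generator
    // (OpenApiInfo comes from Microsoft.OpenApi.Models)
    services.AddSwaggerGen(options =>
        options.SwaggerDoc("v1", new OpenApiInfo { Title = "Weather API", Version = "v1" }));
}

public void Configure(IApplicationBuilder app)
{
    // serves the generated OpenAPI document and the interactive Swagger UI
    app.UseSwagger();
    app.UseSwaggerUI(options =>
        options.SwaggerEndpoint("/swagger/v1/swagger.json", "Weather API v1"));

    app.UseRouting();
    app.UseEndpoints(endpoints => endpoints.MapControllers());
}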
When implementing the ASP.NET Core Web API action methods, the same patterns can be used as for ASP.NET Core MVC applications: Dependency Injection, Repository, Unit of work, etc.
I described them in more detail in my previous article from the series: Architecture of Web Applications. When calling RESTful APIs from another service, the same remote proxy pattern can be used as described in another article from this series: Architecting .NET Desktop and Mobile applications. Other useful patterns, Retry and Circuit Breaker, are covered later in this article.
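As an illustration, a remote proxy for the weather API above could be implemented as a typed HttpClient (a sketch; the class name and base address are placeholders, and the System.Net.Http.Json package is assumed for the JSON helper):

// registered in Startup.ConfigureServices:
// services.AddHttpClient<WeatherForecastClient>(client =>
//     client.BaseAddress = new Uri("https://weather-service/"));
public class WeatherForecastClient
{
    private readonly HttpClient httpClient;

    public WeatherForecastClient(HttpClient httpClient)
    {
        this.httpClient = httpClient;
    }

    public async Task<WeatherForecast[]> GetForecastAsync()
    {
        // GET /weatherforecast and deserialize the JSON response into DTOs
        return await httpClient.GetFromJsonAsync<WeatherForecast[]>("weatherforecast");
    }
}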
Although RESTful services are the most common API implementation today, they are not the only way to implement an API using the request response pattern.
The predecessors of RESTful services were SOAP web services, which differed from RESTful services in many ways. Support for them in .NET Core is also limited:
- There’s no built-in framework or library for implementing SOAP services in .NET Core. In the .NET framework, WCF (Windows Communication Foundation) was used for that purpose, but it wasn’t fully ported to .NET Core (nor are there any plans for that). The only option for implementing services is Core WCF – a far from complete open-source WCF implementation for .NET Core.
- Only a limited client for SOAP services is available, which doesn’t support all the protocols.
The main disadvantages of SOAP were large messages and incomplete support for all its extensions on different development platforms, which made cross-platform compatibility difficult. It doesn’t make much sense to develop new SOAP services today.
A better alternative for RPC-based APIs is gRPC, which has neither of the above-mentioned disadvantages of SOAP services.
.NET Core provides full support for gRPC services and clients. The service interface and messages are described in a Protocol Buffers (.proto) file:
syntax = "proto3";

option csharp_namespace = "GrpcService";

package greet;

service Greeter {
  rpc SayHello (HelloRequest) returns (HelloReply);
}

message HelloRequest {
  string name = 1;
}

message HelloReply {
  string message = 1;
}
The file is used as an input to autogenerate a base class for the service implementation. The service class derives from that class and implements the service methods:
public class GreeterService : Greeter.GreeterBase
{
    public override Task<HelloReply> SayHello(HelloRequest request, ServerCallContext context)
    {
        return Task.FromResult(new HelloReply
        {
            Message = "Hello " + request.Name
        });
    }
}
For the client, the Grpc.Net.Client, Google.Protobuf, and Grpc.Tools NuGet packages must be installed in the project. The latter will autogenerate a remote proxy from the service .proto file which can then be used from the application code:
var channel = GrpcChannel.ForAddress("https://localhost:5001");
var client = new Greeter.GreeterClient(channel);
var response = client.SayHello(new HelloRequest
{
    Name = "World"
});
To learn more about gRPC in .NET Core, read the gRPC with ASP.NET Core 3.0 article by Daniel Jimenez Garcia.
Asynchronous communication
No matter how efficient the chosen underlying protocol is, synchronous communication between services has its disadvantages:
- The calling service can’t complete its operation until it gets the response from the called service. Because of network latency, IO operations are always much slower than local processing and should be kept to a minimum.
- The calling service becomes dependent on the called services. If any of them fails, the calling service will fail as well. This makes the whole application less reliable: if a single service stops working, any services depending on it will stop working as well.
Although these issues are inherent to all distributed systems, switching to asynchronous communication between the services can help with both of them to some extent, because the calling service doesn’t have to wait for a response from the called service anymore.
In the Resilience section later in the article, I will introduce additional patterns for handling challenges of communication between multiple services.
A common pattern for asynchronous communication is publisher-subscriber.
When implemented properly, instead of directly calling another service, the publisher service sends a message to a central event bus or message broker, which is typically a separate service whose only responsibility is delivering messages.
Examples of such services in Azure are Service Bus, Event Grid, and Event Hubs. Other services can subscribe to receive messages based on their type. The event bus persists messages and guarantees that they will be delivered to subscribers even if a subscriber was busy or not operational at the time a message was published.
Figure 3: Publisher-subscriber pattern
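As a minimal sketch of the pattern using Azure Service Bus (the queue name and connection string are placeholders; in a real application, a topic with multiple subscriptions would typically be used instead of a single queue):

// requires the Azure.Messaging.ServiceBus NuGet package
var connectionString = "<service-bus-connection-string>";
await using var client = new ServiceBusClient(connectionString);

// publisher: sends an event to the broker instead of calling the other service directly
var sender = client.CreateSender("order-events");
await sender.SendMessageAsync(new ServiceBusMessage("{ \"orderId\": 42 }"));

// subscriber: processes events as the broker delivers them
var processor = client.CreateProcessor("order-events");
processor.ProcessMessageAsync += async args =>
{
    var body = args.Message.Body.ToString();
    // ... update local state based on the received event ...
    await args.CompleteMessageAsync(args.Message);
};
processor.ProcessErrorAsync += args => Task.CompletedTask;
await processor.StartProcessingAsync();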
This approach can be expanded further into event sourcing with event stream processing. In comparison to the publisher-subscriber pattern, the events in this approach are typically more granular and persisted permanently. This allows them to be replayed in full when a new consumer is introduced which processes the same stream of events in a different way to bring new insights into data.
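Conceptually, this means appending events to a stream that is never modified, so that any new consumer can later replay it from the beginning to build its own view of the data; a simplified in-memory sketch:

public class EventStream
{
    private readonly List<object> events = new List<object>();

    // events are only ever appended, never updated or deleted
    public void Append(object @event) => events.Add(@event);

    // a newly introduced consumer can replay the whole stream
    // to build its own projection of the data from scratch
    public void Replay(Action<object> projectEvent) => events.ForEach(projectEvent);
}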
The asynchronous nature of this communication affects the overall behaviour of the application:
- The services aren’t directly dependent on each other anymore. Consequently, the calling service will not fail if the called service isn’t working. The event will still be published, and the calling service will successfully complete its operation. Due to event persistence, the called service will process the event eventually. This drastically reduces the impact of a single failing service on the overall application.
- When the calling service completes its operation, the processing most likely hasn’t been fully completed yet. It completes later, when all the subscribers have finished their own part of the processing as well.
In a real-world application this means that a confirmation for the order could be sent out before it was fully processed. In a worst-case scenario, this could mean that one of the items ordered will be out of stock and the order will not be delivered in full within the original time estimate. To ensure consistency in data stores where the operation has already been processed, a compensating transaction might be needed to update the state according to a failure that occurred later. In more complex scenarios the Saga design pattern can help with orchestrating the transactions for different outcomes.
- The calling service can’t get any data from other services to do its own processing. Since it’s also not supposed to directly access the data stores of other services, it must have a local copy of any data it needs. This copy of the data has to be kept in sync with the authoritative data source.
Client interaction
Of course, client applications (web, mobile, etc.) also need to communicate with several of the services that comprise the application. This communication is synchronous most of the time (clients expect a direct response to their requests). APIs exposed as RESTful services are the most widely supported across different client technologies.
API gateway
Clients that directly communicate with a multitude of services end up being very closely coupled to them. Many implementation details must be exposed to them for such communication to work; for example, how the load balancing of each service is handled.
To reduce this complexity, an API gateway can be introduced between the clients and the services. This means that the client only communicates with the API gateway which then forwards these calls to the appropriate service. Underlying changes in implementation can often be hidden from the clients. The API gateway can effectively serve as a type of a façade for the services.
Figure 4: API gateway as a façade for services
In addition to simply forwarding the request to the appropriate service, the API gateway will often also be responsible for many cross-cutting concerns, such as authentication, SSL termination, caching, logging, load balancing, etc.
In simple scenarios, a reverse proxy such as NGINX can take the role of an API gateway. For more complex requirements, dedicated solutions are available, such as Azure API Management.
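In the .NET ecosystem, a reverse proxy can also be implemented with the YARP library; a minimal sketch, assuming the Yarp.ReverseProxy NuGet package and routes defined in the ReverseProxy configuration section:

public void ConfigureServices(IServiceCollection services)
{
    // routes and destination clusters are read from configuration
    services.AddReverseProxy()
        .LoadFromConfig(Configuration.GetSection("ReverseProxy"));
}

public void Configure(IApplicationBuilder app)
{
    app.UseRouting();
    app.UseEndpoints(endpoints => endpoints.MapReverseProxy());
}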
Backend for frontend
The role of the API gateway can be pushed another step further. Instead of simply exposing the APIs of individual services to the clients, these individual APIs can be combined into a tailored API for a specific client, such as a mobile application.
This approach of per-client APIs is described by the backend for frontend (BFF) pattern. Multiple clients will each have their own BFF service, although the calls from all of them are ultimately handled by the same underlying services.
Figure 5: Each client has a separate backend for frontend
The scope of BFF services can vary. They might only orchestrate the APIs of several underlying services, i.e., they call multiple services synchronously, wait for their responses and then combine them into a single response to the client request.
To improve performance, they can include a caching layer. This can be basic response caching for replaying responses to identical requests for as long as the cache is valid. But it can also include more sophisticated techniques.
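As a minimal sketch, the built-in ASP.NET Core response caching middleware covers the basic case (the cache duration is just an example):

public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers();
    services.AddResponseCaching();
}

public void Configure(IApplicationBuilder app)
{
    app.UseRouting();
    // caches responses on the server based on their Cache-Control headers
    app.UseResponseCaching();
    app.UseEndpoints(endpoints => endpoints.MapControllers());
}

// on an individual action method, the caching behaviour is set declaratively:
// [HttpGet]
// [ResponseCache(Duration = 60)]
// public IEnumerable<WeatherForecast> Get() { ... }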
A common pattern for more advanced caching of data is materialized view. Instead of caching responses or data received from underlying services, the service has its own data store which can be used to efficiently retrieve requested data, removing the need for calling other services. This data is a duplication of data stored elsewhere and doesn’t represent a source of truth.
Figure 6: Materialized view is updated with data from multiple sources
The important part is how the data is updated after the initial data synchronization or migration.
This depends on the requirements and on how stale the data is allowed to be. In an ideal scenario, it will automatically be updated in response to changes of the source data with minimal delay. This can be achieved using the previously described publisher-subscriber pattern or event sourcing.
The service maintaining the materialized view can subscribe to all events related to changes of the data that it persists. If the other services publish such events for all of their data, this ensures that the materialized view will be up to date most of the time.
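A sketch of such a subscriber, keeping a simple in-memory materialized view up to date from received events (the event and view classes are hypothetical):

// a hypothetical event published by the service owning the order data
public class OrderShipped
{
    public int OrderId { get; set; }
    public DateTime ShippedAt { get; set; }
}

public class OrderStatusView
{
    // a duplicate of data owned by another service; not the source of truth
    private readonly Dictionary<int, string> statusByOrderId = new Dictionary<int, string>();

    // invoked for every OrderShipped event received from the event bus
    public void Handle(OrderShipped @event) =>
        statusByOrderId[@event.OrderId] = $"Shipped on {@event.ShippedAt:d}";

    // client requests can now be served without calling the owning service
    public string GetStatus(int orderId) =>
        statusByOrderId.TryGetValue(orderId, out var status) ? status : "Unknown";
}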
Resilience
Cloud applications are essentially distributed applications. This changes how communication between different parts of the application is performed.
Instead of predominantly in-process communication, which is inherently reliable, most of the intra-application communication takes place over the network, which is much more often subject to failure.
When working on a distributed application for the first time, this aspect is often overlooked due to common false assumptions that were listed as the fallacies of distributed computing by L. Peter Deutsch and James Gosling in the 1990s.
To make the application more resilient to such failures and keep it working, additional measures must be taken. There are proven patterns available that can be used.
Retry and exponential backoff
The Retry pattern’s main concern is how to respond to a failed network call.
Of course, the caller can always immediately give up and propagate the failure as a response back to its caller. However, there is a possibility that the failure was transient, i.e., the next identical call will succeed. In that case, it makes sense to retry the request before giving up. This quickly raises new questions:
- How many times should the call be retried before finally giving up?
- How long should the delay between retries be?
There is no universal answer to these questions.
The best approach strongly depends on the specific request and the type of error returned. The latter should at least make it clear whether the failure was transient or not.
A special category of errors are timeouts, because they don’t necessarily mean that the called service has given up on processing. It could be that a client in the call chain has simply given up on waiting. This means that the request might still complete successfully. That makes it particularly dangerous to repeat a request which isn’t idempotent, i.e., one whose end state will be different if it’s executed multiple times.
In general, a common approach to handling delays between retried requests is exponential backoff, i.e., the delay between the requests is increased exponentially. This ensures that the delay is minimal if the failure was indeed transient and the second request succeeds.
On the other hand, if the failure persists for a longer time, it prevents the called service from being overwhelmed by the increasing rate of repeated requests from multiple clients. Of course, after a certain threshold is reached, the caller will give up retrying and report a failure itself as well.
In the .NET ecosystem, the most popular library for handling transient failures is Polly. It revolves around creating policies for handling individual types of exceptions:
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .Retry();
This policy can then be used when executing an action that could throw such an exception:
var result = retryPolicy.Execute(() => CallService());
The policy above will simply retry once, immediately after a matching exception (considered to be transient) is thrown. But it’s just as easy to create a policy for multiple retries with exponential backoff:
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetry(new[]
    {
        TimeSpan.FromSeconds(1),
        TimeSpan.FromSeconds(2),
        TimeSpan.FromSeconds(4)
    });
Instead of explicitly specifying the delays, they can also be calculated:
var retryPolicy = Policy
    .Handle<HttpRequestException>()
    .WaitAndRetry(5, retryAttempt =>
        TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)));
The library makes it very easy to define detailed policies matching specific requirements. It’s a great alternative to implementing such behaviour manually.
Circuit Breaker
The Retry pattern is localized to a single request.
This means that if a service is sending multiple requests to another service, each one of them is handled independently of the others. However, if one request to a service is failing, it’s very likely that other requests to the same service will be failing as well.
This is where other patterns come into play.
The Circuit Breaker pattern acts as a common proxy for all requests to a particular service. It monitors for failing requests and based on preconfigured rules, transitions between three states:
- Closed: The called service is operating normally. All requests are passed to it.
- Open: The called service is currently failing. Any requests to it are not passed to the service and fail immediately.
- Half-open: After a certain period in the open state, requests to the service are passed through to it again. However, the tolerance for failure is reduced: if any request fails, the state switches back to open. Only after the service appears to operate normally for some time does the circuit breaker return to the closed state.
Figure 7: State diagram for circuit breaker pattern
A pattern like circuit breaker can significantly reduce the number of requests sent to a service with transient issues. This can be helpful in its recovery as it isn’t overwhelmed with incoming requests which it can’t handle.
As with the retry pattern, the Polly library includes an easy-to-use implementation of the circuit breaker pattern:
var circuitBreakerPolicy = Policy
    .Handle<HttpRequestException>()
    .CircuitBreaker(2, TimeSpan.FromMinutes(1));
It makes perfect sense to combine both patterns: retry the requests as needed, but also track the failures and eventually switch the circuit breaker to the open state. The library supports that as well:
var combinedPolicy = Policy
    .Wrap(retryPolicy, circuitBreakerPolicy);
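The resulting policy is then used the same way as an individual one; retries are applied on the outside, while the circuit breaker on the inside keeps track of the failures:

var result = combinedPolicy.Execute(() => CallService());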
Using appropriate error handling mechanisms can noticeably contribute to the overall reliability of a distributed cloud application.
Conclusion
The article covers some of the patterns that are especially useful in distributed cloud applications.
It starts with an overview of different approaches to communication between services, both synchronous and asynchronous. It continues with some specifics of communication between clients and the services, introducing the API gateway and backend for frontend patterns.
The final part addresses the issues of reliability and transient failures in cloud environments. It introduces the retry and circuit breaker patterns as useful tools for dealing with them.
This article was technically reviewed by Daniel Jimenez Garcia.
This article has been editorially reviewed by Suprotim Agarwal.
Damir Arh has many years of experience with software development and maintenance; from complex enterprise software projects to modern consumer-oriented mobile applications. Although he has worked with a wide spectrum of different languages, his favorite language remains C#. In his drive towards better development processes, he is a proponent of Test-driven development, Continuous Integration, and Continuous Deployment. He shares his knowledge by speaking at local user groups and conferences, blogging, and writing articles. He has been a Microsoft MVP for .NET since 2012.