How does a C# LINQ Query execute and How does this affect Performance?

Posted by: Damir Arh , on 2/22/2019, in Category LINQ
Views: 65463
Abstract: The way LINQ queries are executed depends on the kind of data source being queried. This tutorial talks about how a LINQ query is executed and how this affects performance.

The key feature of LINQ is its universal querying API independent of the target data source. However, the way LINQ queries are executed, depends on the kind of data source being queried.

When querying local in-memory collections (commonly called LINQ to Objects), the LINQ extension methods for the IEnumerable<T> interface are used. The implementation of the extension methods somewhat depends on whether the method returns a value typed as IEnumerable<T>, or a scalar value.

The methods returning an IEnumerable<T> execute in a deferred way, i.e. at the point in code where the result is retrieved, not when the query is defined:

var list = new List<int> { 1, 2, 3, 4, 5 };
var query = list.Where(item => item < 3); // not executed yet

It’s not too difficult to implement such an extension method ourselves:

public static IEnumerable<T> Where<T>(this IEnumerable<T> list, Func<T, bool> predicate)
    foreach (var item in list)
        if (predicate(item))
            yield return item;

For a Where method, we only need to iterate through all the items using a foreach loop and return those items which satisfy the supplied predicate.

To achieve deferred execution, the method is implemented as an iterator, i.e. the values are returned using the yield return statement. Whenever this statement is reached, the control of code execution is transferred back to the caller performing the iteration until the next item from the returned IEnumerable<T> is requested.

The methods returning a scalar value are in a way even simpler because they execute immediately:

var list = new List<int> { 1, 2, 3, 4, 5 };
var sum = list.Sum(); // executes immediately

Such extension methods don’t even need to be iterators:

public static int Sum(this IEnumerable<int> list)
    var sum = 0;
    foreach (var item in list)
        sum += item;
    return sum;

For a Sum method, we need to iterate through the items with a foreach loop to calculate the total sum returned in the end.

The fact that LINQ to Objects extension methods accept IEnumerable<T> as their first argument makes them useful not only for querying collection classes, but in other scenarios as well. One such example is querying of XML documents, also known as LINQ to XML.

XML documents must still be loaded in memory when they are queried, but they are not modelled with standard collection classes. Instead, a root XDocument class is used to represent an XML document. Usually the XML document will be read from a file or parsed from an XML string:

var xmlDoc = XDocument.Parse(xmlString);

The Root property will contain an instance of XElement, corresponding to the root element of the XML document. This class provides many methods for accessing the other elements and attributes in the document. The most commonly used among them are probably:

  • Elements, returning the direct child elements as an instance of IEnumerable<XElement>
  • Descendants, returning direct and indirect child elements as an instance of IEnumerable<XElement>
  • Attributes, returning the attributes of the element as an instance of IEnumerable<XAttribute>

Since they all return IEnumerable<T>, they can easily be queried using LINQ. They don’t provide strong type checking for element and argument names, though. Strings are used instead:

var elements = xmlDoc.Root.Elements()
    .Where(element => int.Parse(element.Attribute("age").Value) > 21);

For a certain structure of the XML, the above query would return all person elements with the age attribute above a specific value. Here’s an example of such a document:

  <person name="John" surname="Doe" age="33" />
  <person name="Jane" surname="Doe" age="42" />
  <person name="Johnny" surname="Doe" age="13" />

To use LINQ to XML, we don’t need to concern ourselves with the parsing of the XML documents. Since elements and attributes are exposed as instances of IEnumerable<T>, we can use the LINQ extension methods on them in combination with the properties available on the XElement and XAttribute classes.

The abstraction of underlying data source extends over to LINQ to Entities as well. This term represents querying of external databases using Entity Framework or Entity Framework Core. Although the data is not queried locally in this case, the querying experience is almost identical.

Sample code from here on will be based on Entity Framework Core. To use it with Microsoft SQL Server, the Microsoft.EntityFrameworkCore.SqlServer NuGet package must be installed in your project.

Download the source code of this article (Github)

To specify the mapping between the code and the database tables, corresponding properties must be defined in a class deriving from the DbContext class. Entity Framework will take care of the connectivity to the remote database:

public class PersonContext : DbContext
    public PersonContext(DbContextOptions<PersonContext> options) 
        : base(options)
    { }

    // maps to the Persons table in database
    public DbSet<Person> Persons { get; set; }

Rows in each database table are represented by a simple DTO (data transfer object) matching the table structure:

public class Person
    public int Id { get; set; }
    public string Name { get; set; }
    public string Surname { get; set; }
    public int Age { get; set; }

The connection string for the database can be specified in a DbContextOptions<T> instance:

var optionsBuilder = new DbContextOptionsBuilder<PersonContext>();
var options = optionsBuilder.Options;

With this set up, a database table can be queried very similar to an XML document or a local collection:

using (var context = new PersonContext(options))
    var query = context.Persons.Where(person => person.Age > 21);

Because the DbSet<T> collections implement IQueryable<T>, not just IEnumerable<T>, the extension methods on the former interface are used instead. The IQueryable<T> interface makes it possible to implement a LINQ query provider which takes care of querying the data remotely. In our case, the provider is part of Entity Framework Core.

When writing queries, most of the time we aren’t aware of the difference between the two interfaces because the same set of extension methods is available for both of them. They are just implemented differently.

For the above query, the LINQ query provider for SQL Server would send the following SQL query to the server:

SELECT [person].[Id], [person].[Age], [person].[Name], [person].[Surname]
FROM [Persons] AS[person]
WHERE [person].[Age] > 21

At least for simple cases when we are querying the data, it’s not important what SQL query is generated and how. We can safely assume that the data will be retrieved correctly and in an efficient manner. We can even use certain common .NET methods in the query predicates:

var query = context.Persons.Where(person => Math.Abs(person.Age) > 21);

The LINQ provider will still generate a matching SQL query:

SELECT [person].[Id], [person].[Age], [person].[Name], [person].[Surname]
FROM [Persons] AS[person]
WHERE ABS([person].[Age]) > 21

However, this will only work for functions which have built-in equivalents in the target database and can therefore be implemented in such a way by the LINQ provider. If we start using our own custom methods, they of course won’t be converted correctly any more:

private static int CustomFunction(int val)
    return val + 1;

var query = context.Persons.Where(person => CustomFunction(person.Age) > 21);

In this case, the following SQL query will be sent to the database:

SELECT [person].[Id], [person].[Age], [person].[Name], [person].[Surname]
FROM [Persons] AS[person]

Although all the rows from the Persons table will be retrieved, the LINQ query will still return the correct result. The filtering will be done locally after the data is retrieved. The results are still correct, but the performance will suffer when the number of rows in the table increases. Because of that, we should always make the necessary tests to discover such problems during development.

There are many more types of data sources which can be queried using LINQ. A large collection of third party LINQ providers are published on NuGet for querying different types of relational (NHibernate) and non-relational (MongoDB.Driver) databases, REST services (Linq2Rest), and more.

Depending on how they are implemented, there might be differences in how much of the query is executed remotely and which parts of it are processed locally after the data is received. Therefore it’s important to always read the documentation and test the functionality and performance of the final code.

Download the source code of this article (Github)

This article was technically reviewed by Yacoub Massad.

This article has been editorially reviewed by Suprotim Agarwal.

Absolutely Awesome Book on C# and .NET

C# and .NET have been around for a very long time, but their constant growth means there’s always more to learn.

We at DotNetCurry are very excited to announce The Absolutely Awesome Book on C# and .NET. This is a 500 pages concise technical eBook available in PDF, ePub (iPad), and Mobi (Kindle).

Organized around concepts, this Book aims to provide a concise, yet solid foundation in C# and .NET, covering C# 6.0, C# 7.0 and .NET Core, with chapters on the latest .NET Core 3.0, .NET Standard and C# 8.0 (final release) too. Use these concepts to deepen your existing knowledge of C# and .NET, to have a solid grasp of the latest in C# and .NET OR to crack your next .NET Interview.

Click here to Explore the Table of Contents or Download Sample Chapters!

What Others Are Reading!
Was this article worth reading? Share it with fellow developers too. Thanks!
Share on LinkedIn
Share on Google+

Damir Arh has many years of experience with software development and maintenance; from complex enterprise software projects to modern consumer-oriented mobile applications. Although he has worked with a wide spectrum of different languages, his favorite language remains C#. In his drive towards better development processes, he is a proponent of Test-driven development, Continuous Integration, and Continuous Deployment. He shares his knowledge by speaking at local user groups and conferences, blogging, and writing articles. He is an awarded Microsoft MVP for .NET since 2012.

Page copy protected against web site content infringement 	by Copyscape

Feedback - Leave us some adulation, criticism and everything in between!