Entity Framework Core 5 — Split Queries

Henrique Siebert Domareski
4 min readApr 10, 2021

A new feature on Entity Framework Core 5, it’s the possibility to split a query into multiple SQL queries. On EF Core 3.0 for example, a single query was generated for each LINQ query, but now on EF Core 5, there is the possibility to split the query. In this article, I will explain the difference between them.

To show the differences between the default query and the split query, I’ve created a console application using .NET Core 5.0, and I’ve installed the packages:

  • Microsoft.EntityFrameworkCore.Design 5.0.3
  • Microsoft.EntityFrameworkCore.SqlServer 5.0.3
  • Microsoft.EntityFrameworkCore.Tools 5.0.3

On the Startup class, in the method OnConfiguring, I enabled the option to Log into the console all the queries generated by EF Core:

For demonstration purpose, let’s consider a scenario where we have two entities with a one-to-many relationship (1:N): “Student” and “Course”, where a Course can have many Students, and a Student can only have one Course.

There are already some data that will be created into the database when the application runs:

Course: Information Systems
Student: John
Student: Mark
Course: Computer Science
Student: Luke

Without Split Query

In this example, we are creating a LINQ query using Include, without using the Slipt query, to return the courses and the students:

In this case, EF Core will create a single query where it will be returned the information for the Course and for the Students:

The result of this query will be the following:

Id  Name                  Id  CourseId   Name
1 Information Systems 1 1 John
1 Information Systems 2 1 Mark
2 Computer Science 3 2 Luke

As we can see, the property “Name” was returned twice, because there are two students registered in this course. Now think that for more complex queries with more columns and more relations, this amount of duplicated data can be a big problem.

With Split Query

Now with EF Core 5, we have the possibility to use the method “AsSplitQuery”, to split the query:

In this case, when using the AsSplitQuery method, EF Core will create two queries:

  • In the first query, it will be returned only the information related to the Courses
  • In the second query, it will be returned only the information related to the Students

The queries will be like this:

The result of this query will be the following:

// First query:
Id Name
1 Information Systems
2 Computer Science
// Second query:
Id CourseId Name Id
1 1 John 1
2 1 Mark 1
3 2 Luke 2

As we can see, with SpliQuery, the “Name” of the course was returned only once. This means that even if we have more complex queries with more columns and more relations, the data will not be duplicated, which will result in a much more performant query.

Configuring Split Query Globally

It’s also possible to enable SplitQuery globally, on the OnConfiguring method in the Startup class, using the extension method “UseQuerySplittingBehaviour”:

When configuring it globally, it’s not necessary to use the method “AsSplitQuery” in the LINQ query anymore.

Advantages of using the Split Query:

  • Split query avoids performance issues associated with JOINs
  • A problem that the Split query solves, is the problem known as “cartesian explosion”. An example of this problem is when a blog has multiple related posts, and the rows for these posts will duplicate the blog’s information. The more one-to-many relationships are loaded, the amount of duplicated data may grow and adversely affect the performance of your application.

Disadvantages of using the Split Query:

  • There are no guarantees for data consistency for multiple queries, unlike when there is only a single query. When using multiple queries, if the database is updated concurrently when executing the queries, the resulting data may not be consistent.
  • Each query currently implies an additional roundtrip to the database, and multiple network roundtrips can degrade performance.
  • Some databases allow only a single query to be active at any given point, so all results from earlier queries must be buffered in your application’s memory before executing later queries, which leads to increased memory requirements.

Conclusion

Having a single SQL query for each LINQ query ensures consistency of the data that will be returned, however, this can become very slow when the query uses Include or a projection to bring back multiple related collections. Split into multiple SQL queries, can significantly improve the performance of the application, but can result in inconsistency in the returned results if the data changes between the two queries. So this is something that you always need to consider when thinking about using the split query method.

You can check the code of this project here:

https://github.com/henriquesd/EntityFrameworkCore5Examples

If you like this solution, I kindly ask you to give a ⭐️ in the repository.

Thanks for reading!

--

--