With the recent release of .NET Core 2.1 and, along with it, Entity Framework Core 2.1, I thought I’d share a few tips and best practices on how to use EF efficiently and avoid some common pitfalls.
I’ve divided these tips into four sections: maintainability, performance, troubleshooting and testing. Without further ado, let’s get going!
Maintainability
Use eager loading
Before EF Core was released I was one of those developers who were used to the comfort of lazy loading. I knew it was kind of easy to accidentally trigger the lazy evaluation of some navigation properties but over time I learned to avoid those cases in the first place. Things worked quite smoothly especially on smaller projects and the code was pretty concise given that EF was responsible for retrieving the data for me whenever needed.
Then EF Core was released, and boom, it didn’t have lazy loading at all. It only had eager (and explicit) loading, so I thought it was probably time to give it a try. Now, after using it in many projects for about two years, I can safely say that it makes a developer’s life better in many ways. Most importantly, eager loading forces you to think about data access patterns: what data you need to access and how. Some of the additional benefits of eager loading that I can think of:
- Nothing is included by default, so you get no surprises
- It is much easier to understand how your code gets turned into SQL and executed
- Your code will perform better in most cases
- You can use true POCO entity classes without leaking EF concepts into them (e.g. virtual navigation properties)
Group and re-use include statements
One of the downsides of eager loading is that your code can quickly become somewhat polluted with include statements.
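As an illustration, a query that needs a handful of related entities might look like the sketch below. The `Order`, `Customer`, `LineItem` and `Product` entities and the `ShopContext` class are hypothetical names used only for this example.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entities used only for illustration.
public class Customer { public int Id { get; set; } public string Name { get; set; } }
public class Product { public int Id { get; set; } public string Name { get; set; } }

public class LineItem
{
    public int Id { get; set; }
    public decimal Amount { get; set; }
    public Product Product { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public Customer Customer { get; set; }
    public ICollection<LineItem> Items { get; set; }
}

public class ShopContext : DbContext
{
    public ShopContext(DbContextOptions<ShopContext> options) : base(options) { }
    public DbSet<Order> Orders { get; set; }
}

public static class OrderQueries
{
    // Every related entity has to be requested explicitly.
    public static Order FindOrder(ShopContext context, int orderId)
    {
        return context.Orders
            .Include(o => o.Customer)
            .Include(o => o.Items)
                .ThenInclude(i => i.Product)
            .SingleOrDefault(o => o.Id == orderId);
    }
}
```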
That works just fine but when your application grows you’ll notice that those Include-statements are absolutely everywhere. You might also notice that you need to include the same properties (or a subset of those) in many different queries making it more difficult to manage them. To keep the performance optimal you shouldn’t load unnecessary properties but you also need to make sure you’ve loaded all the necessary ones for any given scenario.
What we can do to overcome this problem is to extract the Include-calls to separate extension methods.
Now you can use those extensions in queries:
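A minimal sketch of both steps, the extension methods and a query that uses them, could look like this. The entity names and the `IncludeCustomer` / `IncludeItemsWithProducts` methods are hypothetical; in a real application you would pass `context.Orders` as the `orders` argument.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entities used only for illustration.
public class Customer { public int Id { get; set; } public string Name { get; set; } }
public class Product { public int Id { get; set; } public string Name { get; set; } }

public class LineItem
{
    public int Id { get; set; }
    public Product Product { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public Customer Customer { get; set; }
    public ICollection<LineItem> Items { get; set; }
}

// Each method names one "shape" of data that several queries need.
public static class OrderIncludeExtensions
{
    public static IQueryable<Order> IncludeCustomer(this IQueryable<Order> query)
        => query.Include(o => o.Customer);

    public static IQueryable<Order> IncludeItemsWithProducts(this IQueryable<Order> query)
        => query.Include(o => o.Items).ThenInclude(i => i.Product);
}

public static class OrderQueries
{
    // The query now states what it needs without repeating the Include chain.
    public static Order FindOrderWithDetails(IQueryable<Order> orders, int orderId)
        => orders
            .IncludeCustomer()
            .IncludeItemsWithProducts()
            .SingleOrDefault(o => o.Id == orderId);
}
```

Because the extensions take and return `IQueryable<Order>`, they compose freely, so a query can pick exactly the subset of includes it needs.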
This will give you a more manageable approach that is also easier to read and understand for other developers.
Don’t initialize collections manually
One lesson that I’ve learned from practice is not to manually initialize any collections in your entities. The reason is that if you forget to include a navigation property, initializing it in the constructor will only mask that bug: the collection looks empty instead of not loaded. Instead, write comprehensive tests that will catch these errors in the first place.
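The difference can be sketched without a database at all. The two classes below are hypothetical; creating them with `new` stands in for what you typically get back from a query that did not include `Items` (assuming no related entities happen to be tracked already).

```csharp
using System.Collections.Generic;

public class LineItem { public int Id { get; set; } }

// Collection initialized by hand: a forgotten Include looks like "no items".
public class OrderWithInitializer
{
    public int Id { get; set; }
    public ICollection<LineItem> Items { get; set; } = new List<LineItem>();
}

// Collection left alone: a forgotten Include leaves Items as null.
public class Order
{
    public int Id { get; set; }
    public ICollection<LineItem> Items { get; set; }
}

public static class ItemCounter
{
    // Silently returns 0 when Items was never loaded.
    public static int Count(OrderWithInitializer order) => order.Items.Count;

    // Throws NullReferenceException when Items was never loaded,
    // which makes the missing Include visible in tests.
    public static int Count(Order order) => order.Items.Count;
}
```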
Performance
Always async-await
If you’re not doing it already, start using async and await to improve the scalability of your application. Especially with I/O-intensive operations like heavier SQL queries, this frees the thread to serve other requests while it waits for the database to respond.
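A minimal sketch of an async query, assuming a hypothetical `OrderService` and `ShopContext`:

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and context used only for illustration.
public class Order
{
    public int Id { get; set; }
    public int CustomerId { get; set; }
}

public class ShopContext : DbContext
{
    public ShopContext(DbContextOptions<ShopContext> options) : base(options) { }
    public DbSet<Order> Orders { get; set; }
}

public class OrderService
{
    private readonly ShopContext _context;

    public OrderService(ShopContext context) => _context = context;

    // ToListAsync releases the calling thread while the database does its work.
    public async Task<List<Order>> GetOrdersForCustomerAsync(int customerId)
    {
        return await _context.Orders
            .Where(o => o.CustomerId == customerId)
            .ToListAsync();
    }
}
```

EF Core offers async counterparts for the other query-executing operators as well, such as `SingleOrDefaultAsync`, `AnyAsync` and `SaveChangesAsync`.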
Avoid client evaluation
Client evaluation must be one of the most dangerous features of EF Core since it’s so easy to overlook:
EF Core tries to translate any LINQ expression you write against your DB into SQL. If it can’t, the untranslatable part of the query is evaluated on the client, without you knowing unless you look at the logs.
What makes that nice is that almost any query you write will work but, and this is a big but, it can kill performance. If you query a large dataset, client evaluation only happens after all the rows have been returned from the DB. This can cause severe performance and memory issues if you’re not careful.
Whenever EF falls back to evaluating a query on the client it will log a warning (provided you have logging enabled, that is). You can also disable client evaluation completely.
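To make this concrete, here is a sketch of a filter that EF Core 2.1 cannot translate, next to an equivalent one that it can. `IsLarge`, `Order.Total` and `ShopContext` are hypothetical names chosen for the example.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and context used only for illustration.
public class Order
{
    public int Id { get; set; }
    public decimal Total { get; set; }
}

public class ShopContext : DbContext
{
    public ShopContext(DbContextOptions<ShopContext> options) : base(options) { }
    public DbSet<Order> Orders { get; set; }
}

public static class OrderFilters
{
    // A plain C# method: EF Core has no SQL translation for it.
    public static bool IsLarge(Order order) => order.Total > 1000m;
}

public static class OrderQueries
{
    // In EF Core 2.1 this loads every order and filters them in memory.
    public static List<Order> LargeOrdersOnClient(ShopContext context)
        => context.Orders.Where(o => OrderFilters.IsLarge(o)).ToList();

    // The same condition written inline becomes a SQL WHERE clause.
    public static List<Order> LargeOrdersInSql(ShopContext context)
        => context.Orders.Where(o => o.Total > 1000m).ToList();
}
```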
Use projections where appropriate
Projections are easily one of the most important concepts for keeping your queries and application performant: whenever you need to load a larger set of entities from the database, e.g. to show a list of orders in an e-commerce website, do not query entities directly!
Instead, use projections that allow you to return only those fields from the database that you really need.
In the example above, only the data that we select will be returned from the database, potentially increasing performance dramatically compared to loading all the entities with all their fields and navigation properties. Note that you don’t need to include any navigation properties when using projections.
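A sketch of such a projection for an order list, assuming hypothetical `Order`, `Customer` and `LineItem` entities and a hypothetical `OrderListItem` view model. In a real application the source would be `context.Orders`.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities used only for illustration.
public class Customer { public int Id { get; set; } public string Name { get; set; } }
public class LineItem { public int Id { get; set; } }

public class Order
{
    public int Id { get; set; }
    public Customer Customer { get; set; }
    public ICollection<LineItem> Items { get; set; }
}

// Only the fields the list view actually shows.
public class OrderListItem
{
    public int Id { get; set; }
    public string CustomerName { get; set; }
    public int ItemCount { get; set; }
}

public static class OrderProjections
{
    // Select decides which columns are fetched; no Include calls are needed.
    public static IQueryable<OrderListItem> ToListItems(this IQueryable<Order> orders)
        => orders.Select(o => new OrderListItem
        {
            Id = o.Id,
            CustomerName = o.Customer.Name,
            ItemCount = o.Items.Count
        });
}
```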
In general, I tend to use projections when I need to query data for views over multiple entities, like lists for example. When I’m only dealing with a single entity or need to modify data, I load the actual entity.
Optimize correlated sub-queries
Correlated sub-queries can kill query performance because they result in N + 1 queries. EF Core 2.1 now lets you optimize these kinds of queries by buffering the results, which brings the total down to just 2 queries.
Here’s an example of a correlated sub-query that results in N + 1 queries:
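A sketch of such a query, assuming hypothetical `Order` and `LineItem` entities; in a real application `orders` would be `context.Orders`:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities used only for illustration.
public class LineItem
{
    public int Id { get; set; }
    public decimal Amount { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public ICollection<LineItem> Items { get; set; }
}

public static class OrderQueries
{
    // The inner Where is a sub-query that depends on the outer order,
    // so it is executed once per order when the results are enumerated.
    public static IQueryable<IEnumerable<LineItem>> LargeItemsPerOrder(IQueryable<Order> orders)
        => orders.Select(o => o.Items.Where(i => i.Amount > 100m));
}
```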
Running this query will first fetch all orders and then run the sub-query for each of them returning all items with amount greater than 100.
To allow buffering and squeeze this into just 2 queries (one for orders, one for line items) all you need to do is add an explicit .ToList() after the inner query:
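A sketch of the buffered variant, again with hypothetical `Order` and `LineItem` entities and with `context.Orders` as the intended source:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities used only for illustration.
public class LineItem
{
    public int Id { get; set; }
    public decimal Amount { get; set; }
}

public class Order
{
    public int Id { get; set; }
    public ICollection<LineItem> Items { get; set; }
}

public static class OrderQueries
{
    // The explicit ToList() on the inner query tells EF Core 2.1 that the
    // results may be buffered, so the line items can be fetched in one query.
    public static IQueryable<List<LineItem>> LargeItemsPerOrder(IQueryable<Order> orders)
        => orders.Select(o => o.Items.Where(i => i.Amount > 100m).ToList());
}
```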
Don’t forget to declare indexes
This isn’t anything new or fancy. Just make sure that you explicitly add indexes for fields that you query by and that aren’t primary or foreign keys.
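Indexes can be declared with the fluent API in `OnModelCreating`. The sketch below assumes a hypothetical `Order` entity that is frequently looked up by order number and listed by creation date.

```csharp
using System;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity used only for illustration.
public class Order
{
    public int Id { get; set; }
    public string OrderNumber { get; set; }
    public string Status { get; set; }
    public DateTime CreatedAt { get; set; }
}

public class ShopContext : DbContext
{
    public ShopContext(DbContextOptions<ShopContext> options) : base(options) { }
    public DbSet<Order> Orders { get; set; }

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        // Single-column index for lookups by order number.
        modelBuilder.Entity<Order>()
            .HasIndex(o => o.OrderNumber)
            .IsUnique();

        // Composite index for queries that filter by status and sort by date.
        modelBuilder.Entity<Order>()
            .HasIndex(o => new { o.Status, o.CreatedAt });
    }
}
```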
Troubleshooting
Turn on logging
EF Core logs a lot of important and helpful messages, such as warnings for client-side evaluation and other issues. Make sure that you are logging those, for example using my favorite logging library Serilog.
Disable client evaluation in the development environment
I strongly suggest disabling client evaluation in the development environment to make sure that you don’t unintentionally deploy such code to production.
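One possible way to wire this up outside of ASP.NET Core is sketched below. It assumes the Serilog, Serilog.Sinks.Console and Serilog.Extensions.Logging packages are installed; `EfLogging` and `LogToSerilog` are hypothetical names.

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Logging;
using Serilog;

public static class EfLogging
{
    // One shared factory for the whole application; creating a new
    // factory per context instance would be wasteful.
    public static readonly ILoggerFactory SerilogFactory =
        new LoggerFactory().AddSerilog(
            new LoggerConfiguration()
                .MinimumLevel.Information()
                .WriteTo.Console()
                .CreateLogger());

    // Routes EF Core's log messages (SQL, warnings) to Serilog.
    public static DbContextOptionsBuilder LogToSerilog(this DbContextOptionsBuilder builder)
        => builder.UseLoggerFactory(SerilogFactory);
}
```

You would then call `optionsBuilder.LogToSerilog()` wherever the context options are configured, for example in `OnConfiguring`.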
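In EF Core 2.x this can be done by turning the client evaluation warning into an exception. The sketch below uses a hypothetical `ThrowOnClientEvaluation` helper and an `isDevelopment` flag that you would supply from your own environment check; note that `RelationalEventId.QueryClientEvaluationWarning` is specific to the 2.x releases.

```csharp
using Microsoft.EntityFrameworkCore;
using Microsoft.EntityFrameworkCore.Diagnostics;

public static class ClientEvaluationGuard
{
    // Hypothetical helper: makes any query that falls back to client
    // evaluation throw instead of only logging a warning.
    public static DbContextOptionsBuilder ThrowOnClientEvaluation(
        this DbContextOptionsBuilder builder, bool isDevelopment)
    {
        if (isDevelopment)
        {
            builder.ConfigureWarnings(warnings =>
                warnings.Throw(RelationalEventId.QueryClientEvaluationWarning));
        }

        return builder;
    }
}
```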
Testing
EF’s in-memory database isn’t a relational database
EF Core comes with an in-memory database that you can use for testing your services and APIs. It’s super useful, but keep in mind that it isn’t a relational database and thus doesn’t enforce the constraints of one. This includes index and foreign key constraints, so for example deleting an entity can succeed in the in-memory database whereas in an actual relational database it would fail.
Consider using SQLite In-memory
To overcome the issues discussed above with EF’s own in-memory database, you can decide to use SQLite in-memory instead. It’s probably not as fast or robust, but it can help you catch database issues that would otherwise go unnoticed.
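A minimal sketch of a test fixture for this, assuming the Microsoft.EntityFrameworkCore.Sqlite package and hypothetical `Order`, `ShopContext` and `SqliteInMemoryFixture` types. The important detail is that an in-memory SQLite database only lives as long as its connection stays open.

```csharp
using System;
using Microsoft.Data.Sqlite;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and context used only for illustration.
public class Order
{
    public int Id { get; set; }
    public string OrderNumber { get; set; }
}

public class ShopContext : DbContext
{
    public ShopContext(DbContextOptions<ShopContext> options) : base(options) { }
    public DbSet<Order> Orders { get; set; }
}

public sealed class SqliteInMemoryFixture : IDisposable
{
    private readonly SqliteConnection _connection;

    public DbContextOptions<ShopContext> Options { get; }

    public SqliteInMemoryFixture()
    {
        // Keep the connection open: closing it drops the in-memory database.
        _connection = new SqliteConnection("DataSource=:memory:");
        _connection.Open();

        Options = new DbContextOptionsBuilder<ShopContext>()
            .UseSqlite(_connection)
            .Options;

        // Create the schema from the model.
        using (var context = new ShopContext(Options))
        {
            context.Database.EnsureCreated();
        }
    }

    // Each test can create fresh context instances over the same database.
    public ShopContext CreateContext() => new ShopContext(Options);

    public void Dispose() => _connection.Dispose();
}
```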