EF: Lazy loading, eager loading, and "enumerating the enumerable"


Question




I find I'm confused about lazy loading, etc.

First, are these two statements equivalent?

(1) Lazy loading:
_flaggedDates = context.FlaggedDates.Include("scheduledSchools")
    .Include("interviews").Include("partialDayAvailableBlocks")
    .Include("visit").Include("events");

(2) Eager loading:
_flaggedDates = context.FlaggedDates;

In other words, in (1) the "Includes" cause the navigation collections/properties to be loaded along with the specific collection requested, regardless of the fact that you are using lazy loading ... right?

And in (2), the statement will load all the navigation entities even though you do not specifically request them, because you are using eager loading ... right?

Second: even if you are using eager loading, the data will not actually be downloaded from the database until you "enumerate the enumerable", as in the following code:

var dates = from d in _flaggedDates
            where d.dateID == 2
            select d;
foreach (FlaggedDate date in dates)
{
... etc.
}

The data will not actually be downloaded ("enumerated") until the foreach loop ... right? In other words, the "var dates" line defines the query, but the query is not executed until the foreach loop.

Given that (if my assumptions are correct), what's the real difference between eager loading and lazy loading?? It seems that in either case, the data does not appear until the enumeration. Am I missing something?

(My specific experience is with code-first, POCO development, by the way ... though the questions may apply more generally.)

Solution

Your description of (1) is correct, but it is an example of Eager Loading rather than Lazy Loading.

Your description of (2) is incorrect. (2) is technically using no loading at all, but will use Lazy Loading if you try to access any non-scalar values on your FlaggedDates.

In either case, you are correct that no data will be loaded from your data store until you attempt to "do something" with the _flaggedDates. However, what happens is different in each case.

(1): Eager loading: as soon as you begin your foreach loop, every one of the objects that you have specified will get pulled from the database and built into a large in-memory data structure. This can be a very expensive operation, pulling an enormous amount of data from your database. However, it all happens in one database round trip, with a single SQL query being executed.

(2): Lazy loading: when your foreach loop begins, only the FlaggedDate objects are loaded. If you access related objects inside the loop, they will not be in memory yet. The first attempt to retrieve the scheduledSchools for a given FlaggedDate results in either a new database round trip to retrieve the schools, or an exception because your context has already been disposed. Since you'd be accessing the scheduledSchools collection inside the loop, you would incur a new database round trip for every FlaggedDate that was loaded when the loop began.
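To make the difference in round trips concrete, here is a minimal sketch that uses no Entity Framework at all. FakeStore, LoadingDemo, and the sample school names are hypothetical stand-ins invented for illustration; the store only counts simulated round trips, under the assumption that a query with Include costs one trip and each lazy navigation access costs one more.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database. It only counts simulated round trips;
// nothing here is part of Entity Framework.
public class FakeStore
{
    private readonly Dictionary<int, string[]> _schoolsByDateId = new Dictionary<int, string[]>
    {
        { 1, new[] { "North", "South" } },
        { 2, new[] { "East" } },
        { 3, new[] { "West", "Central" } },
    };

    public int RoundTrips { get; private set; }

    // Iterator method: the body, including the counter increment, does not run
    // until the caller starts enumerating the result.
    public IEnumerable<int> QueryDateIds()
    {
        RoundTrips++;
        foreach (var id in _schoolsByDateId.Keys)
        {
            yield return id;
        }
    }

    // Models lazy loading of one navigation collection: one extra trip per call.
    public string[] LoadSchools(int dateId)
    {
        RoundTrips++;
        return _schoolsByDateId[dateId];
    }

    // Models a query with Include: dates and schools arrive in the same trip.
    public IEnumerable<KeyValuePair<int, string[]>> QueryDatesWithSchools()
    {
        RoundTrips++;
        foreach (var pair in _schoolsByDateId)
        {
            yield return pair;
        }
    }
}

public static class LoadingDemo
{
    // Defining a query costs nothing; enumerating it costs one trip.
    public static (int Before, int After) DeferredExecution()
    {
        var store = new FakeStore();
        var query = store.QueryDateIds().Where(id => id == 2); // defined, not executed
        int before = store.RoundTrips;
        int matches = query.Count();                           // enumeration happens here
        return (before, store.RoundTrips);
    }

    // Lazy style: one trip for the dates, then one more per date inside the loop.
    public static (int Trips, int Schools) LazyStyle()
    {
        var store = new FakeStore();
        int schools = 0;
        foreach (var id in store.QueryDateIds())
        {
            schools += store.LoadSchools(id).Length;
        }
        return (store.RoundTrips, schools);
    }

    // Eager style: everything comes back in the single initial trip.
    public static (int Trips, int Schools) EagerStyle()
    {
        var store = new FakeStore();
        int schools = 0;
        foreach (var pair in store.QueryDatesWithSchools())
        {
            schools += pair.Value.Length;
        }
        return (store.RoundTrips, schools);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(LoadingDemo.DeferredExecution()); // Before = 0, After = 1
        Console.WriteLine(LoadingDemo.LazyStyle());         // Trips = 4, Schools = 5
        Console.WriteLine(LoadingDemo.EagerStyle());        // Trips = 1, Schools = 5
    }
}
```

Both styles end up with the same five schools in this toy data; what differs is when the trips happen and how many there are (one plus the number of dates for the lazy style, one for the eager style).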

Response to Comments

Disabling Lazy Loading is not the same as Enabling Eager Loading. In this example:

context.ContextOptions.LazyLoadingEnabled = false;
var schools = context.FlaggedDates.First().scheduledSchools;

The schools variable will contain an empty EntityCollection, because I didn't Include them in the original query (FlaggedDates.First()), and I disabled lazy loading so that they couldn't be loaded after the initial query had been executed.
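The three possible outcomes (included up front, lazily loaded on first access, or never loaded) can be modeled in plain C#. This is only a sketch of the behavior described above, not how Entity Framework is implemented; FlaggedDateModel, ModelDemo, and the sample school names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Plain C# model of the three cases; none of these types come from Entity Framework.
public class FlaggedDateModel
{
    private readonly Func<List<string>> _loader;   // stands in for a follow-up query
    private readonly bool _lazyLoadingEnabled;
    private List<string> _schools;

    public FlaggedDateModel(Func<List<string>> loader, bool lazyLoadingEnabled,
                            List<string> included = null)
    {
        _loader = loader;
        _lazyLoadingEnabled = lazyLoadingEnabled;
        _schools = included;                       // non-null only when "Included"
    }

    public List<string> ScheduledSchools
    {
        get
        {
            if (_schools == null)
            {
                // Lazy loading on: run the follow-up query at first access.
                // Lazy loading off: nothing loads it, so the collection stays empty.
                _schools = _lazyLoadingEnabled ? _loader() : new List<string>();
            }
            return _schools;
        }
    }
}

public static class ModelDemo
{
    private static List<string> QuerySchools()
    {
        return new List<string> { "North", "South" };
    }

    // Returns how many schools the caller sees for a given configuration.
    public static int VisibleSchools(bool lazyLoadingEnabled, bool included)
    {
        var preloaded = included ? QuerySchools() : null;
        var date = new FlaggedDateModel(QuerySchools, lazyLoadingEnabled, preloaded);
        return date.ScheduledSchools.Count;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(ModelDemo.VisibleSchools(false, true));  // included: 2
        Console.WriteLine(ModelDemo.VisibleSchools(true, false));  // lazy: 2
        Console.WriteLine(ModelDemo.VisibleSchools(false, false)); // neither: 0
    }
}
```

The last case corresponds to the example above: lazy loading is off and nothing was included, so the collection is empty rather than populated.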

You are correct that the where d.dateID == 2 clause means that only the objects related to that specific FlaggedDate object are pulled in. However, depending on how many objects are related to that FlaggedDate, you could still end up with a lot of data going over the wire. This is due to the way Entity Framework builds its SQL query. SQL query results are always tabular, meaning every row must have the same number of columns. For every scheduledSchool object, there needs to be at least one row in the result set, and since every row has to contain some value for every column, you end up with every scalar value on your FlaggedDate object being repeated. So if you have 10 scheduledSchools and 10 interviews associated with your FlaggedDate, you'll end up with 20 rows that each contain every scalar value on FlaggedDate. Half of the rows will have null values for all of the scheduledSchools columns, and the other half will have null values for all of the interviews columns.

Where this gets really bad, though, is if you go "deep" in the data you're including. For example, if each ScheduledSchool had a students property, which you included as well, then suddenly you would have a row for each Student in each ScheduledSchool, and on each of those rows, every scalar value for the Student's ScheduledSchool would be included (even though only the first row's values end up getting used), along with every scalar value on the original FlaggedDate object. It can add up quickly.
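The arithmetic in the two paragraphs above can be sketched as a few helper functions. This assumes the result shape described here (one row per child for sibling collections, one row per grandchild for nested ones); the SQL actually generated depends on the EF version. IncludeRowEstimate, the 30 students per school, and the 5 scalar columns on FlaggedDate are hypothetical figures chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Back-of-the-envelope row estimates for a single parent entity with Includes.
public static class IncludeRowEstimate
{
    // Sibling collections each contribute one row per child;
    // a parent with no children still needs one row.
    public static int SiblingRows(params int[] childCounts)
    {
        return Math.Max(1, childCounts.Sum());
    }

    // Nested collection: one row per grandchild; a child with no
    // grandchildren still needs one row of its own.
    public static int NestedRows(IEnumerable<int> grandchildrenPerChild)
    {
        return Math.Max(1, grandchildrenPerChild.Sum(n => Math.Max(1, n)));
    }

    // Every row repeats all of the parent's scalar columns.
    public static int RepeatedParentValues(int rows, int parentScalarColumns)
    {
        return rows * parentScalarColumns;
    }
}

public static class Program
{
    public static void Main()
    {
        // 10 scheduledSchools and 10 interviews, as in the text above.
        int siblingRows = IncludeRowEstimate.SiblingRows(10, 10);
        Console.WriteLine(siblingRows); // 20

        // Hypothetical: each of the 10 schools has 30 students.
        int nestedRows = IncludeRowEstimate.NestedRows(Enumerable.Repeat(30, 10));
        Console.WriteLine(nestedRows); // 300

        // Hypothetical: FlaggedDate has 5 scalar columns.
        Console.WriteLine(IncludeRowEstimate.RepeatedParentValues(nestedRows, 5)); // 1500
    }
}
```

Under these assumptions a single FlaggedDate with five scalar values sends those values 300 times once students are included, which is the duplication the answer warns about.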

It's difficult to explain in writing, but if you look at the actual data coming back from a query with multiple Includes, you will see that there is a lot of duplicate data. You can use LINQPad to see the SQL queries generated by your EF code.
