使用Entity Framework执行简单查询时出现严重的性能问题 [英] Severe performance problems when performing a simple query using Entity Framework

查看:93
本文介绍了使用Entity Framework执行简单查询时出现严重的性能问题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个相当通用的CRUD webapp,它根据几个数据库表的内容动态生成页面。我正在使用Entity Framework 4.0将数据从数据库中提取出来,但是我遇到了严重的性能问题。我已经设法迭代了一个包含足够的问题,我可以在下面详细说明。



我有一个包含页面表单列表(〜200)。每个表单都有一个或多个字段(总共约4000个),每个字段可能有一些参数(共16000)。



我附上了我的模型屏幕截图:





关联的实体对象如下:

  public class Form 
{
public int FormID {get;组; }
public string FormName {get;组; }

public IList< FormField> FormFields {get;组; }

}

public class FormField
{
public int FieldID {get;组; }
public string FieldName {get;组; }
public int FormID {get;组; }

public IList< FormFieldParameter> FormFieldParameters {get;组; }
public Form ParentForm {get;组; }

}

public class FormFieldParameter
{
public int FieldParamID {get;组; }
public string Value {get;组; }
public int? FieldID {get;组; }

public FormField ParentField {get;组; }
}

以下代码提取了Form的所有数据,该数据的ID为'1'。

  EntityConnection myConnection = new EntityConnection(name = myModel); 

if(conn.State!= ConnectionState.Open){
conn.Open();
}
ObjectContext context = new ObjectContext(name = myModel);
context.ContextOptions.LazyLoadingEnabled = false;

ObjectQuery< PageForm> myObjectSet = context.CreateObjectSet< PageForm>()
.Include(FormField.FormFieldParameter);

//编辑:我错过了这个部分,对不起。事后看来,这正是导致这个问题的原因。
IEnumerable< PageForm> myObjectSetEnumerable = myObjectSet.AsEnumerable();
IQueryable< PageForm> myFilteredObjectSet = myObjectSetEnumerable.Where(c => c.FormID == 1)
.AsQueryable();


列表< PageForm> myReturnValue = myFilteredObjectSet.toList();

现在,虽然这样做的确很糟糕。这个查询需要一秒钟的时间才能运行,其中的全部消耗在 myFilteredObjectSet.toList()调用中。我在我的数据库上运行了一个分析器来查看导致延迟的原因,并发现正在生成以下查询:

  SELECT 
[Project1]。[FormID] AS [FormID],
[Project1]。[FormName] AS [FormName],
[Project1]。[C2] AS [C1],
[Project1]。[FormID1] AS [FormID1],
[Project1]。[FieldID] AS [FieldID],
[Project1]。[FieldName] AS [FieldName],
[项目1] [C1] AS [C2],
[Project1]。[FieldParamID] AS [FieldParamID],
[Project1]。[Value] AS [Value],
[Project1 ] [FieldID1] AS [FieldID1]
FROM(SELECT
[Extent1]。[FormID] AS [FormID],
[Extent1]。[FormName] AS [FormName],
[Join1]。[FieldID] AS [FieldID],
[Join1]。[FieldName] AS [FieldName],
[Join1]。[FormID] AS [FormID1],
[Join1]。[FieldParamID] AS [FieldParamID],
[Join1]。[Value] AS [Value],
[Join1]。[FieldID1] AS [FieldID1],
CASE WHEN ([Join1]。[FieldID] IS NULL)THEN CAST(NULL AS int)W HEN([Join1]。[FieldParamID] IS NULL)THEN CAST(NULL AS int)ELSE 1 END AS [C1],
CASE WHEN([Join1]。[FieldID] IS NULL)THEN CAST )[2]
FROM [dbo]。[PageForm] AS [Extent1]
LEFT OUTER JOIN(SELECT [Extent2]。[FieldID] AS [FieldID],[Extent2]。[ [FieldName] AS [FieldName],[Extent2]。[FormID] AS [FormID],[Extent3]。[FieldParamID] AS [FieldParamID],[Extent3]。[Value] AS [Value],[Extent3] AS [FieldID1]
FROM [dbo]。[FormField] AS [Extent2]
LEFT OUTER JOIN [dbo]。[FormFieldParameter] AS [Extent3] ON [Extent2]。[FieldID] = [Extent3] 。[FieldID])AS [Join1] ON [Extent1]。[FormID] = [Join1]。[FormID]
)AS [Project1]
ORDER BY [Project1]。[FormID] ASC,项目1]。[C2] ASC,[Project1]。[FieldID] ASC,[Project1]。[C1] ASC

sql分析器上显示的此查询的持续时间显示此查询是运行时间长的时间。查询的有趣之处在于它根本没有过滤 - 它返回整个树!我不明白为什么它返回所有内容,因为过滤器 myObjectSet.Where(c => c.FormID == 1)是非常明确的。实际返回的对象只包含一个单个条目,我期望的。



我在整个数据访问层遇到这个问题,其性能令人震惊。我不知道为什么生成的查询不包含过滤器 - 不知道如何告诉它这样做。有人知道答案吗?

解决方案

TL; DR删除 AsEnumerable 调用并替换为 AsQueryable 调用,它应该解决大多数性能问题(实际数据库执行成本之外的速度很慢,这是通过在列上添加索引来修复的过滤/加入)



实际发生的事情的说明...



一旦你调用 AsEnumerable ,你现在就超出了实体框架和LINQ对象的世界。这意味着它将针对数据库执行针对数据库的查询。再次调用 AsQueryable 并不重要,这仅仅意味着您正在针对内存结构创建一个查询。



有效的执行是这样的。


  1. 创建一个对象查询,包括链接到表单
  2. $ b的所有FormFieldProperties $ b
  3. 将当前的IQueryable实例转换为可枚举的。

  4. 向枚举实例添加一个谓词,该实例只返回FormID值为一个的项。

  5. 调用ToList,将所有值从源可枚举复制到列表。

现在,直到第4步,查询实际上没有查询数据库。当您调用 ToList 时,它会在第一步执行查询(如您所见)。这个查询可能是昂贵的,需要一段时间,因为它返回的数据量和/或丢失的索引可能会提高其性能。



一旦查询完成,



现在,每个对象被迭代和检查,看看它是否匹配在步骤3中添加的谓词。如果它是匹配,那么它返回给任何正在迭代的人(在这种情况下是ToList函数)。



现在值已经返回,它被添加到正在使用值创建的列表。



最后,您从 ToList 方法中获取列表,它具有您所要求的完全符合它在记忆中而不是在数据库中完成所有这些。


I've got a fairly generic CRUD webapp, which generates pages dynamically according to the contents of several database tables. I'm using Entity Framework 4.0 to pull this data out of the DB, however I'm running into severe performance problems. I've managed to iterate down into a problem which is contained enough that I can detail below.

I have a table containing list of Page Forms (~200). Each form has one or more Fields (~4000 total), and each field has may have some Parameters (~16000 total).

I've attached a screenshot of my model below:

The associated entity objects are as follows:

public class Form
{
    public int FormID { get; set; }
    public string FormName { get; set; }

    public IList<FormField> FormFields { get; set; }

}

public class FormField
{
    public int FieldID { get; set; }
    public string FieldName { get; set; }
    public int FormID{ get; set; } 

    public IList<FormFieldParameter> FormFieldParameters { get; set; }
    public Form ParentForm { get; set; }

}

public class FormFieldParameter
{
    public int FieldParamID{ get; set; }
    public string Value{ get; set; }
    public int? FieldID { get; set; }

    public FormField ParentField { get; set; }
}

The following code pulls out all data for the Form which has an ID of '1'.

EntityConnection myConnection = new EntityConnection("name=myModel");

if(conn.State != ConnectionState.Open) {
    conn.Open();
}
ObjectContext context = new ObjectContext("name=myModel");
context.ContextOptions.LazyLoadingEnabled = false;

ObjectQuery<PageForm> myObjectSet = context.CreateObjectSet<PageForm>()
                                           .Include("FormField.FormFieldParameter");

//Edit: I missed this part out, sorry. In hindsight, this was exactly what was
//causing the issue.
IEnumerable<PageForm> myObjectSetEnumerable = myObjectSet.AsEnumerable();
IQueryable<PageForm> myFilteredObjectSet = myObjectSetEnumerable.Where(c => c.FormID == 1)
                                                                .AsQueryable();


List<PageForm> myReturnValue = myFilteredObjectSet.toList();

Now, while this does work, it runs really poorly. The query takes over a second to run, the entirety of which is spent in the myFilteredObjectSet.toList() call. I ran a profiler on my database to see what was causing the delay, and found that the following query was being generated:

SELECT 
[Project1].[FormID] AS [FormID], 
[Project1].[FormName] AS [FormName], 
[Project1].[C2] AS [C1], 
[Project1].[FormID1] AS [FormID1], 
[Project1].[FieldID] AS [FieldID], 
[Project1].[FieldName] AS [FieldName], 
[Project1].[C1] AS [C2], 
[Project1].[FieldParamID] AS [FieldParamID], 
[Project1].[Value] AS [Value], 
[Project1].[FieldID1] AS [FieldID1]
FROM ( SELECT 
    [Extent1].[FormID] AS [FormID], 
    [Extent1].[FormName] AS [FormName], 
    [Join1].[FieldID] AS [FieldID], 
    [Join1].[FieldName] AS [FieldName], 
    [Join1].[FormID] AS [FormID1], 
    [Join1].[FieldParamID] AS [FieldParamID], 
    [Join1].[Value] AS [Value], 
    [Join1].[FieldID1] AS [FieldID1], 
    CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) WHEN ([Join1].[FieldParamID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1], 
    CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
    FROM  [dbo].[PageForm] AS [Extent1]
    LEFT OUTER JOIN  (SELECT [Extent2].[FieldID] AS [FieldID], [Extent2].[FieldName] AS [FieldName], [Extent2].[FormID] AS [FormID], [Extent3].[FieldParamID] AS [FieldParamID], [Extent3].[Value] AS [Value], [Extent3].[FieldID] AS [FieldID1]
        FROM  [dbo].[FormField] AS [Extent2]
        LEFT OUTER JOIN [dbo].[FormFieldParameter] AS [Extent3] ON [Extent2].[FieldID] = [Extent3].[FieldID] ) AS [Join1] ON [Extent1].[FormID] = [Join1].[FormID]
)  AS [Project1]
ORDER BY [Project1].[FormID] ASC, [Project1].[C2] ASC, [Project1].[FieldID] ASC, [Project1].[C1] ASC

The duration of this query shown on the sql profiler shows that this query is what is taking so long to run. The interesting thing about the query, is that there is no filtering on it at all - It is returning the entire tree! I can't understand why it is returning everything, as the filter myObjectSet.Where(c => c.FormID == 1) is pretty explicit. The actual returned object only contains a single entry, which I would expect.

I'm having this problem across my entire data access layer, and its performance is appalling. I have no idea why the generated query doesn't contain the filter - and no idea how to tell it to do so. Does anybody know the answer?

解决方案

TL;DR Remove the AsEnumerable call and replace it with an AsQueryable call and it should resolve most of the performance issues (outside of actual database execution cost being slow, which is fixed by adding indexes on columns you are filtering / joining on).

Explanation of what is actually happening...

As soon as you call AsEnumerable you are now outside of Entity Framework and in the world of LINQ-to-objects. That means it's going to execute the query against the database when it is enumerated against. It doesn't matter that you call AsQueryable again, that merely means that you are creating a query against an in memory structure.

The effective execution is this.

  1. Create an object query, including all FormFieldProperties linked to the form
  2. Transform the current IQueryable instance into an enumerable.
  3. Add a predicate against the enumerable instance which will only return items whose FormID value is one.
  4. Call ToList, which copies all values from source enumerable to a list.

Now, up until step 4, the query actually hasn't queried the database. When you call ToList, it executes the query in step one (as you see). This query likely is expensive and takes a while because of the amount of data it is returning and/or missing indexes that may improve it's performance.

Once that query is done and materialized, it's result is wrapped in an enumerator.

Now, every object is iterated and checked to see if it matches the predicate that was added in step 3. If it does match, then it is returned to whoever is iterating over it (in this case, the ToList function).

Now that the value has been returned, it is added to the list that is being created with the values.

Finally, you get a list back from the ToList method, and it has exactly what you asked for, but it did all of that in memory rather than in the database.

这篇关于使用Entity Framework执行简单查询时出现严重的性能问题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆