如何让EF6生成包含多个聚合列的高效SQL? [英] How do I get EF6 to generate efficient SQL containing mulitple aggregate columns?

查看:198
本文介绍了如何让EF6生成包含多个聚合列的高效SQL?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图让Entity Framework(v6.1.3)为具有多个聚合的查询生成高效的SQL。



这是一个简化的例子。 p>

表:

  CREATE TABLE [dbo]。[CaseAttorney](
[CaseAttorneyID] [int] IDENTITY(1,1)NOT NULL,
[CaseNumber] [varchar](30)NOT NULL,
[AttorneyID] [int] NOT NULL,
[DateAssigned] [datetime] NULL,
[DateUnassigned] [datetime] NULL,
CONSTRAINT [PK_CaseAttorney] PRIMARY KEY CLUSTERED

[CaseAttorneyID] ASC
) WITH(PAD_INDEX = OFF,STATISTICS_NORECOMPUTE = OFF,IGNORE_DUP_KEY = OFF,ALLOW_ROW_LOCKS = ON,ALLOW_PAGE_LOCKS = ON)ON [PRIMARY]
)ON [PRIMARY]

C#:

  using(var cx = new DATA 
{

var startDate = DateTime.Parse(1/1/2014);
var endDate = startDate.AddDays(1);

cx.Database.Log = Console.WriteLine;

var res = cx.CaseAttorneys。
GroupBy(o => new
{
AttorneyID = o.AttorneyID
})选择(g => new
{
AttorneyID = g.Key.AttorneyID
ActiveStart = g.Sum(item =>(item.DateAssigned< startDate&&(item.DateUnassigned == null || item.DateUnassigned> = startDate)?1 :0)),
Assigned = g.Sum(item =>(item.DateAssigned> = startDate&& item.DateAssigned< = endDate)?1:0)
}) .ToArray();

}

而不是使用单个 GROUP BY ,我得到一个非常低效的查询,包含多个嵌套表。这种情况发生在 COUNT SUM 之间:

  SELECT 
[Project3]。[AttorneyID] AS [AttorneyID],
[Project3]。[C1] AS [C1],
[Project3] C2] AS [C2]
FROM(SELECT
[Project2]。[AttorneyID] AS [AttorneyID],
[Project2]。[C1] AS [C1],
SELECT
SUM([Filter2]。[A1])AS [A1]
FROM(SELECT
CASE WHEN(([Extent3]。[DateAssigned]> = @ p__linq__2)AND([ Extend3]。[DateAssigned]< = @ p__linq__3))THEN 1 ELSE 0 END AS [A1]
FROM [dbo]。[CaseAttorney] AS [Extent3]
WHERE [Project2]。[AttorneyID] = [Extent3]。[AttorneyID]
)AS [Filter2])AS [C2]
FROM(SELECT
[Distinct1]。[AttorneyID] AS [AttorneyID],
SELECT
SUM([Filter1]。[A1])AS [A1]
FROM(SELECT
CASE WHEN(([Extent2]。[DateAssigned]< @ p__linq__0)AND(([ Extent2] [DateUnassigned] IS [Ext] 2 [$] $ [$] $ [$] $ [$] $ [$] $ [$] [AttorneyID] = [Extent2]。[AttorneyID]
)AS [Filter1] AS [C1]
FROM(SELECT DISTINCT
[Extent1]。[AttorneyID] AS [ [CaseAttorney] AS [Extent1]
)AS [Distinct1]
)AS [Project2]
)AS [Project3]

如果没有一遍又一遍地打到相同的表,嵌套本身并不会太糟糕。这个问题变得越来越多,添加的聚合列越多。



我没有发现任何类似的问题,所以我确定我做错了。 p>

当我想要返回多个汇总列时,获取实体框架以生成高效投影的正确方法是什么?

解决方案

Count(predicate)(实际上任何涉及谓词的函数)似乎都对生成的SQL查询产生了影响。然而,条件 Sum (即 Sum(谓词?1:0))没有这样的影响,所以以下将会做你想要的:



更新:事实证明, Sum 伎俩是必要的,但是当谓词使用像你的情况下的变量不够。它最可能是EF错误,因为玩不同的 GroupBy 重载无效,除非您在之前包含条件表达式之前的临时投影的GroupBy



所以(最后)以下查询

  db.CaseAttorneys。选择(item => new 
{
Item = item,
ActiveStart = item.DateAssigned< startDate&&(item.DateUnassigned == null || item.DateUnassigned> = startDate)?1:0,
分配= item.DateAssigned> = startDate&&  item.DateAssigned< = endDate?1:0
})
.GroupBy => new
{
AttorneyID = o.Item.AttorneyID
})
.Select(g => new
{
AttorneyID = g .Key.AttorneyID
ActiveStart = g.Sum(item => item.ActiveStart),
Assigned = g.Sum(item => item.Assigned)
})ToArray ();

产生所需的SQL

  SELECT 
[GroupBy1]。[K1] AS [AttorneyID],
[GroupBy1]。[A1] AS [C1],
[GroupBy1]。[A2 ] AS [C2]
FROM(SELECT
[Extent1]。[K1] AS [K1],
SUM([Extent1]。[A1])AS [A1],
[SU]([Extent1]。[A2])AS [A2]
FROM(SELECT
[Extent1]。[AttorneyID] AS [K1],
CASE WHEN(([Extent1])。 [DateAssigned]< @ p__linq__0)AND(([Extent1]。[DateUnassigned] IS NULL)OR([Extent1]。[DateUnassigned]> = @ p__linq__1)))THEN 1 ELSE 0 END AS [A1],
CASE WHEN([Extent1]。[DateAssigned]> = @ p__linq__2)AND([Extent1]。[DateAssigned]< = @ p__linq__3))THEN 1 ELSE 0 END AS [A2]
FROM [ dbo]。[CaseAttorneys] AS [Extent1]
)AS [Extent1]
GROUP BY [K1]
)AS [GroupBy1]
pre>

I'm trying to get the Entity Framework (v6.1.3) to generate efficient SQL for a query with multiple aggregates.

Here's a simplified example.

Table:

CREATE TABLE [dbo].[CaseAttorney](
[CaseAttorneyID] [int] IDENTITY(1,1) NOT NULL,
[CaseNumber] [varchar](30) NOT NULL,
[AttorneyID] [int] NOT NULL,
[DateAssigned] [datetime] NULL,
[DateUnassigned] [datetime] NULL,
 CONSTRAINT [PK_CaseAttorney] PRIMARY KEY CLUSTERED 
(
    [CaseAttorneyID] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]

C#:

using (var cx = new DATA())
{

    var startDate = DateTime.Parse("1/1/2014");
    var endDate = startDate.AddDays(1);

    cx.Database.Log = Console.WriteLine;

    var res = cx.CaseAttorneys.
        GroupBy(o => new
        {
            AttorneyID = o.AttorneyID
        }).Select(g => new
        {
            AttorneyID = g.Key.AttorneyID,
            ActiveStart = g.Sum(item => (item.DateAssigned < startDate && (item.DateUnassigned == null || item.DateUnassigned >= startDate) ? 1 : 0)),
            Assigned = g.Sum(item => (item.DateAssigned >= startDate && item.DateAssigned <= endDate) ? 1 : 0)
        }).ToArray();

}

Instead of generating a query with a single GROUP BY, I get a very inefficient query containing multiple nested tables. This happens with both COUNTand SUM:

SELECT 
[Project3].[AttorneyID] AS [AttorneyID], 
[Project3].[C1] AS [C1], 
[Project3].[C2] AS [C2]
FROM ( SELECT 
    [Project2].[AttorneyID] AS [AttorneyID], 
    [Project2].[C1] AS [C1], 
    (SELECT 
        SUM([Filter2].[A1]) AS [A1]
        FROM ( SELECT 
            CASE WHEN (([Extent3].[DateAssigned] >= @p__linq__2) AND ([Extent3].[DateAssigned] <= @p__linq__3)) THEN 1 ELSE 0 END AS [A1]
            FROM [dbo].[CaseAttorney] AS [Extent3]
            WHERE [Project2].[AttorneyID] = [Extent3].[AttorneyID]
        )  AS [Filter2]) AS [C2]
    FROM ( SELECT 
        [Distinct1].[AttorneyID] AS [AttorneyID], 
        (SELECT 
            SUM([Filter1].[A1]) AS [A1]
            FROM ( SELECT 
                CASE WHEN (([Extent2].[DateAssigned] < @p__linq__0) AND (([Extent2].[DateUnassigned] IS NULL) OR ([Extent2].[DateUnassigned] >= @p__linq__1))) THEN 1 ELSE 0 END AS [A1]
                FROM [dbo].[CaseAttorney] AS [Extent2]
                WHERE [Distinct1].[AttorneyID] = [Extent2].[AttorneyID]
            )  AS [Filter1]) AS [C1]
        FROM ( SELECT DISTINCT 
            [Extent1].[AttorneyID] AS [AttorneyID]
            FROM [dbo].[CaseAttorney] AS [Extent1]
        )  AS [Distinct1]
    )  AS [Project2]
)  AS [Project3]

The nesting in and of itself wouldn't be too bad if it didn't keep hitting the same tables over and over again. This problem gets worse the more aggregate columns are added.

I haven't found any similar questions here, so I'm sure I'm doing something wrong.

What is the correct way of getting the Entity Framework to generate an efficient projection when I want to return multiple aggregate columns?

解决方案

Count(predicate) (and actually any function involving predicates) seems to have that effect on the generated SQL query.

However, the conditional Sum (i.e. Sum(predicate ? 1 : 0)) has no such affect, so the following will do what you want:

Update: It turns out that the Sum trick is necessary, but not enough when predicates use variables like in your case. It most probably is EF bug, because playing with different GroupBy overloads doesn't help, except if you include a temporary projection including conditional expressions before doing GroupBy.

So (finally) the following query

db.CaseAttorneys.Select(item => new
{
    Item = item,
    ActiveStart = item.DateAssigned < startDate && (item.DateUnassigned == null || item.DateUnassigned >= startDate) ? 1 : 0,
    Assigned = item.DateAssigned >= startDate && item.DateAssigned <= endDate ? 1 : 0
})
.GroupBy(o => new
{
    AttorneyID = o.Item.AttorneyID
})
.Select(g => new
{
    AttorneyID = g.Key.AttorneyID,
    ActiveStart = g.Sum(item => item.ActiveStart),
    Assigned = g.Sum(item => item.Assigned)
}).ToArray();

produced the desired SQL

SELECT 
    [GroupBy1].[K1] AS [AttorneyID], 
    [GroupBy1].[A1] AS [C1], 
    [GroupBy1].[A2] AS [C2]
    FROM ( SELECT 
        [Extent1].[K1] AS [K1], 
        SUM([Extent1].[A1]) AS [A1], 
        SUM([Extent1].[A2]) AS [A2]
        FROM ( SELECT 
            [Extent1].[AttorneyID] AS [K1], 
            CASE WHEN (([Extent1].[DateAssigned] < @p__linq__0) AND (([Extent1].[DateUnassigned] IS NULL) OR ([Extent1].[DateUnassigned] >= @p__linq__1))) THEN 1 ELSE 0 END AS [A1], 
            CASE WHEN (([Extent1].[DateAssigned] >= @p__linq__2) AND ([Extent1].[DateAssigned] <= @p__linq__3)) THEN 1 ELSE 0 END AS [A2]
            FROM [dbo].[CaseAttorneys] AS [Extent1]
        )  AS [Extent1]
        GROUP BY [K1]
    )  AS [GroupBy1]

这篇关于如何让EF6生成包含多个聚合列的高效SQL?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆