实体框架多重聚合性能 [英] Entity Framework multiple aggregation performance

查看:106
本文介绍了实体框架多重聚合性能的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述



模式



我有一个表结构,如(
[ID] [bigint] IDENTITY(1, 1)NOT NULL,
[ProjectID] [bigint] NULL,
CONSTRAINT [PrimaryKey1] PRIMARY KEY CLUSTERED([ID] ASC)


CREATE TABLE [ dbo]。[DCDistributionBox](
[ID] [bigint] IDENTITY(1,1)NOT NULL,
[DataLoggerID] [bigint] NOT NULL,
CONSTRAINT [PrimaryKey2] PRIMARY KEY CLUSTERED ([ID] ASC)


ALTER TABLE [dbo]。[DCDistributionBox]
ADD CONSTRAINT [FK_DCDistributionBox_DataLogger] FOREIGN KEY([DataLoggerID])
参考文献[ dbo]。[DataLogger]([ID])

CREATE TABLE [dbo]。[DCString](
[ID] [bigint] IDENTITY(1,1)NOT NULL,
[DCDistributionBoxID] [bigint] NOT NULL,
[CurrentMPP] [decimal](18,2)NULL,
CONSTRAINT [PrimaryKey3] PRIMARY KEY CLUSTERED [ID]


ALTER TABLE [dbo]。[DCString]
ADD CONSTRAINT [FK_DCString_DCDistributionBox] FOREIGN KEY([DCDistributionBoxID])
参考文献[dbo ] [DCDistributionBox]([ID])

CREATE TABLE [dbo]。[StringData](
[DCStringID] [bigint] NOT NULL,
[TimeStamp] [datetime ] NOT NULL,
[DCCurrent] [decimal](18,2)NULL,
CONSTRAINT [PrimaryKey4] PRIMARY KEY CLUSTERED([TimeStamp] DESC,[DCStringID] ASC)

具有以下存储数据的 [StringData] 数据空间:26,901.86 MB

  • 行数:131,827,749

  • 分区:true

  • 分区数:62



  • 用法



    我现在想将数据分组到 [StringData] 表中,并进行一些聚合。



    在纯SQL中,它将如下所示:

      declare @projectID bigint = 20686; 
    declare @from datetime = '06 .02.2016';
    declare @till datetime = '07 .02.2016';
    declare @interval int = 15;

    SELECT
    DATEADD(MINUTE,DATEDIFF(MINUTE,0,[StringData]。[TimeStamp])/ @interval * @interval,0)AS [TimeStamp]
    ,AVG ([StringData]。[DCCurrent] / [DCString]。[CurrentMPP])AS [DCCurrentAvg]
    ,MIN([StringData]。[DCCurrent] / [DCString]。[CurrentMPP])AS [DCCurrentMin] $ b $ DC,$($ String)[DCCurrent] / [DCString]。[CurrentMPP])AS [DCCurrentMax]
    ,STDEV([StringData]。[DCCurrent] / [DCString] DCCurrentStDev]
    ,COUNT(*)AS [Count]

    FROM [StringData]
    JOIN [DCString] ON [DCString]。[ID] = [StringData] [DCStringID ]
    JOIN [DCDistributionBox] ON [DCDistributionBox]。[ID] = [DCString]。[DCDistributionBoxID]
    JOIN [DataLogger] ON [DataLogger]。[ID] = [DCDistributionBox]。[DataLoggerID]

    WHERE [DataLogger]。[ProjectID] = @projectID
    AND [StringData]。[TimeStamp]> = @from
    AND [StringData]。[TimeStamp] @till

    GROUP BY DATEADD(MINUTE,DATEDIFF(MINUTE,0,[StringData]。[TimeStamp])/ @interval * @interval,0)

    Excecution timespan: 653ms



    现在我创建了一个实体框架等价(我想):

      var compareData = model.StringDatas 
    AsNoTracking()
    .Where p => p.DCString.DCDistributionBox.DataLogger.ProjectID == projectID&& p.TimeStamp> = from&& p.TimeStamp< until)
    .Select(d => new
    {
    TimeStamp = d.Key,
    DCCurrentMin = d.Min(v => v.DCCurrent / v.DCString.CurrentMPP),
    DCCurrentMax = d.Max v => v.DCCurrent / v.DCString.CurrentMPP),
    DCCurrentAvg = d.Average(v => v.DCCurrent / v.DCString.CurrentMPP),
    DCCurrentStDev = DbFunctions.StandardDeviation d.Select(v => v.DCCurrent / v.DCString.CurrentMPP))
    })
    .ToList();

    而且,超时的结果是超时(30秒以上)! b
    $ b

    尝试



    我现在看了实体框架生成的SQL查询,如下所示:

      SELECT 
    1 AS [C1],
    [Project10]。[C1] AS [C2],
    [Project10 ] [C2] AS [C3],
    [Project10]。[C3] AS [C4],
    [Project10]。[C4] AS [C5],
    [Project10]。 [C5] AS [C6]
    FROM(SELECT
    [Project8]。[C1] AS [C1],
    [Project8]。[C2] AS [C2],
    [C3] AS [C3],
    [Project8] [C4] AS [C4],
    (SELECT
    STDEV([Project9]。[A1])AS [ [DCCurrent] / [Project9]。[CurrentMPP] AS [A1]
    FROM(SELECT
    [Extent17]。[DCStringID] AS [DCStringID],
    [Extent17]。[DCCurrent] AS [DCCurrent],
    [Extent18]。[ID] AS [ID],
    [范围18] [CurrentMPP] AS [CurrentMPP]
    FROM [dbo]。[StringData] AS [Extent17]
    INNER JOIN [dbo]。[DCString] AS [Extent18] ON [Extent17]。[DCStringID ] = [Extent18]。[ID]
    INNER JOIN [dbo]。[DCDistributionBox] AS [Extent19] ON [Extent18]。[DCDistributionBoxID] = [Extent19]。[ID]
    INNER JOIN [dbo ] [DataLogger] AS [Extent20] ON [Extent19]。[DataLoggerID] = [Extent20]。[ID]
    WHERE(([Extent20]。[ProjectID] = @ p__linq__0)OR(([Extent20])。 [ProjectID] IS NULL)AND(@ p__linq__0 IS NULL)))AND([Extent17]。[TimeStamp]> = @ p__linq__1)AND([Extent17]。[TimeStamp] @ p__linq__2)AND(([Project8])[C1] =(DATEADD(分钟,((DATEDIFF(分钟,@ p__linq__4,[Extent17]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3))) OR(([Project8]。[C1] IS NULL)AND(DATEADD(分钟,((DATEDIFF(分钟,@ p__linq__4,[Extent17]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3) ))
    )AS [Project9]
    )AS [Project9])AS [C5]
    FROM(SELECT
    [Project6]。[C1] AS [C1],$ [C2] AS [C2],
    [Project6]。[C3] AS [C3],
    (SELECT
    AVG([Project7]。[A1] )AS [A1]
    FROM(SELECT
    [Project7]。[DCCurrent] / [Project7]。[CurrentMPP] AS [A1]
    FROM(SELECT
    [Extent13]。 [DCStringID] AS [DCStringID],
    [Extent13]。[DCCurrent] AS [DCCurrent],
    [Extent14]。[ID] AS [ID],
    [Extent14]。[CurrentMPP ] AS [CurrentMPP]
    FROM [dbo]。[StringData] AS [Extent13]
    INNER JOIN [dbo]。[DCString] AS [Extent14] ON [Extent13]。[DCStringID] = [Extent14]。[ID] $ b $ [DCB] [AS] [[]] [[]] [[]] [[]] [[]] [[]] [ ] [DataLoggerID] = [Extent16]。[ID]
    WHERE(([Extent16]。[ProjectID] = @ p__linq__0)OR(([Extent16]。[ProjectID] IS NULL)AND(@ p__linq__0 IS NULL )))AND([Extent13]。[TimeStamp]> = @ p__linq__1)AND([Extent13]。[TimeStamp] @ p__linq__2)AND(([Project6])[C1] =(DATEADD(分钟,((DATEDIFF(分钟,@ p__linq__4,[Extent13]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3) OR(([Project6]。[C1] IS NULL)AND(DATEADD(minute,((DATEDIFF(minute,@ p__linq__4,[Extent13]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3) ))
    )AS [Project7]
    )AS [Project7])AS [C4]
    FROM(SELECT
    [Project4]。[C1] AS [C1],
    [Project4]。[C2] AS [C2],
    (SELECT
    MAX([Project5]。[A1])AS [A1]
    FROM(SELECT
    [Project5]。[DCCurrent] / [Project5]。[CurrentMPP] AS [A1]
    FROM(SELECT
    [Extent9]。[DCStringID] AS [DCStringID],
    [Extent9]。 [DCCurrent] AS [DCCurrent],
    [Extent10]。[ID] AS [ID],
    [Extent10]。[CurrentMPP] AS [CurrentMPP]
    FROM [dbo]。[StringData] AS [Extent9]
    INNER JOIN [dbo]。[DCString] AS [Extent10] ON [Extent9]。[DCStringID] = [Extent10] [DCDistributionBoxID] = [Extent11]。[ID]
    INNER JOIN [dbo]。[DataLogger] AS [ Extent12] ON [Extent11] [DataLoggerID] = [Extent12]。[ID]
    WHERE(([Extent12])[ProjectID] = @ p__linq__0)OR(([Extent12])[ProjectID] IS NULL)AND (@ p__linq__0 IS NULL)))AND([Extent9]。[TimeStamp]> = @ p__linq__1)AND([Extent9]。[TimeStamp] @ p__linq__2)AND(([Project4])[C1] =(DATEADD(分钟,((DATEDIFF(分钟,@ p__linq__4,[Extent9]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3))) OR(([Project4])[C1] IS NULL)AND(DATEADD(分钟,((DATEDIFF(分钟,@ p__linq__4,[Extent9]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3) ))
    )AS [Project5]
    )AS [Project5])AS [C3]
    FROM(SELECT
    [Project2]。[C1] AS [C1],
    (SELECT
    MIN([Project3]。[A1])AS [A1]
    FROM(SELECT
    [Project3]。[DCCurrent] / [Project3]。[CurrentMPP] AS [A1]
    FROM(SELECT
    [Extent5]。[DCStringID] AS [DCStringID],
    [Extent5]。[DCCurrent] AS [DCCurrent],
    [Extent6]。 [ID] AS [ID],
    [Extent6]。[CurrentMPP ] AS [CurrentMPP]
    FROM [dbo]。[StringData] AS [Extent5]
    INNER JOIN [dbo]。[DCString] AS [Extent6] ON [Extent5]。[DCStringID] = [Extent6] 。[ID]
    INNER JOIN [dbo]。[DCDistributionBox] AS [Extent7] ON [Extent6]。[DCDistributionBoxID] = [Extent7]。[ID]
    INNER JOIN [dbo]。[DataLogger] AS [Extent8] ON [Extent7]。[DataLoggerID] = [Extent8]。[ID]
    WHERE(([Extent8]。[ProjectID] = @ p__linq__0)OR(([Extent8]] [ProjectID] )AND(@ p__linq__0 IS NULL)))AND([Extent5]。[TimeStamp]> = @ p__linq__1)AND([Extent5]。[TimeStamp] @ p__linq__2)AND(([Project2])[C1] =(DATEADD(分钟,((DATEDIFF(分钟,@ p__linq__4,[Extent5]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3) OR(([Project2]。[C1] IS NULL)AND(DATEADD(minute,((DATEDIFF(minute,@ p__linq__4,[Extent5]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3) ))
    )AS [Project3]
    )AS [Project3])AS [C2]
    FROM(SELECT
    [Distinct1]。[C1] AS [C1]
    FROM(SELECT DISTINCT
    DATEADD(minute,((DATEDIFF(minute,@ p__linq__4,[Extent1]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3)AS [C1]
    FROM [dbo]。[StringData] AS [Extent1]
    INNER JOIN [dbo]。[DCString] AS [Extent2] ON [Extent1]。[DCStringID] = [Extent2]。[ID]
    INNER JOIN [dbo]。[DCDistributionBox] AS [Extent3] ON [Extent2]。[DCDistributionBoxID ] = [Extent3]。[ID]
    INNER JOIN [dbo]。[DataLogger] AS [Extent4] ON [Extent3]。[DataLoggerID] = [Extent4]。[ID]
    WHERE(([ Extent4]。[ProjectID] = @ p__linq__0)OR(([Extent4]。[ProjectID] IS NULL)AND(@ p__linq__0 IS NULL)))AND([Extent1]。[TimeStamp]> = @ p__linq__1)AND([ Extent1]。[TimeStamp] @ p__linq__2)
    )AS [Distinct1]
    )AS [Project2]
    )AS [Project4]
    )AS [Project6]
    )AS [Project8]
    )AS [Project10]



    问题



    为什么Entity Framework将每个聚合分离成一个sngle子查询,我如何避免这样做来获得原始SQL查询的性能?



    更新1



    这具有完全相同的SQL查询输出和超时结果:

      var query = from d in model.StringDatas 
    其中d.DCString.DCDistributionBox.DataLogger.ProjectID == projectID
    其中d.TimeStamp> = fromDate
    其中d.TimeStamp< untilDate
    group d by DbFunctions.AddMinutes(DateTime.MinValue,DbFunctions.DiffMinutes(DateTime.MinValue,d.TimeStamp)/ minuteInterval * minuteInterval)into g
    select new
    {
    TimeStamp = g.Key,
    DCCurrentMin = g.Min(v => v.DCCurrent / v.DCString.CurrentMPP),
    DCCurrentMax = g.Max(v => v.DCCurrent / v .DCString.CurrentMPP),
    DCCurrentAvg = g.Average(v => v.DCCurrent / v.DCString.CurrentMPP),
    DCCurrentStDev = DbFunctions.StandardDeviation(g.Select(v => v .DCCurrent / v.DCString.CurrentMPP))
    };

    var queryResult = query.ToList();


    解决方案

    我注意到行为(bug?)回答如何我是否获得EF6以生成包含多个聚合列的高效SQL?



    我能够解决的唯一方法是在组之前引入操作,同样适用于您的情况:

      var query = 
    from e in d in db.StringDatas.AsNoTracking()
    其中d.DCString.DCDistributionBox.DataLogger.ProjectID == projectID
    && d.TimeStamp> = fromDate&& d.TimeStamp< untilDate
    选择新的{d,s = d.DCString})
    组e由DbFunctions.AddMinutes(DateTime.MinValue,DbFunctions.DiffMinutes(DateTime.MinValue,edTimeStamp)/ minuteInterval * minuteInterval)转换为g
    let ratio = g.Select(e => edDCCurrent / esCurrentMPP)
    选择新
    {
    TimeStamp = g.Key,
    DCCurrentMin = ratio。 Min(),
    DCCurrentMax = ratio.Max(),
    DCCurrentAvg = ratio.Average(),
    DCCurrentStDev = DbFunctions。 StandardDeviation(ratio)
    };

    EF生成的SQL:

      SELECT 
    1 AS [C1],
    [GroupBy1]。[K1] AS [C2],
    [GroupBy1]。[A1] AS [C3] $ [$] $ [$] $ [$] $ [$] $ [$] $ [$] $ [b] b FROM(SELECT
    [Project1]。[K1] AS [K1],
    MIN([Project1]。[A1])AS [A1],
    MAX([Project1] A2])AS [A2],
    AVG([Project1]。[A3])AS [A3],
    STDEV([Project1]。[A4])AS [A4]
    FROM (SELECT
    DATEADD(分钟,((DATEDIFF(分钟,@ p__linq__4,[Project1]。[TimeStamp]))/ @ p__linq__5)* @ p__linq__6,@ p__linq__3)AS [K1],
    [Project1 ] [DCCurrent] / [Project1]。[CurrentMPP] AS [A1],
    [Project1]。[DCCurrent] / [Project1]。[CurrentMPP] AS [A2],
    [Project1]。 [DCCurrent] / [Project1]。[CurrentMPP] AS [A3],
    [Project1]。[DCCurrent] / [Project1]。[CurrentMPP] AS [A4]
    FROM(SELECT
    [Extent1]。[TimeStamp] AS [TimeStamp],
    [Extent1]。[DCCurrent] AS [DCCurrent],
    [Extent2]。[CurrentMPP] AS [CurrentMPP] $ [$] $ [$] $ [$] $ [$] $ [$] $ [$] $ [$] $ [$] $ [$] $ [$] b $ b INNER JOIN [dbo]。[DCDistributionBoxes] AS [Extent3] ON [Extent2]。[DCDistributionBoxID] = [Extent3]。[ID]
    INNER JOIN [dbo]。[DataLoggers] AS [Extent4] ON [Extent3]。[DataLoggerID] = [Extent4]。[ID]
    WHERE([Extent4]。[ProjectID] = @ p__linq__0)AND([Extent1]。[TimeStamp]> = @ p__linq__1)AND Extent1]。[TimeStamp] @ p__linq__2)
    )AS [Project1]
    )AS [Project1]
    GROUP BY [K1]
    )AS [GroupBy1]


    I have a question about entity framework query building.

    Schema

    I have a table structure like this:

    CREATE TABLE [dbo].[DataLogger](
        [ID] [bigint] IDENTITY(1,1) NOT NULL,
        [ProjectID] [bigint] NULL,
        CONSTRAINT [PrimaryKey1] PRIMARY KEY CLUSTERED ( [ID] ASC )
    )
    
    CREATE TABLE [dbo].[DCDistributionBox](
        [ID] [bigint] IDENTITY(1,1) NOT NULL,
        [DataLoggerID] [bigint] NOT NULL,
        CONSTRAINT [PrimaryKey2] PRIMARY KEY CLUSTERED ( [ID] ASC )
    )
    
    ALTER TABLE [dbo].[DCDistributionBox]
        ADD CONSTRAINT [FK_DCDistributionBox_DataLogger] FOREIGN KEY([DataLoggerID])
        REFERENCES [dbo].[DataLogger] ([ID])
    
    CREATE TABLE [dbo].[DCString] (
        [ID] [bigint] IDENTITY(1,1) NOT NULL,
        [DCDistributionBoxID] [bigint] NOT NULL,
        [CurrentMPP] [decimal](18, 2) NULL,
        CONSTRAINT [PrimaryKey3] PRIMARY KEY CLUSTERED ( [ID] ASC )
    )
    
    ALTER TABLE [dbo].[DCString]
        ADD CONSTRAINT [FK_DCString_DCDistributionBox] FOREIGN KEY([DCDistributionBoxID])
        REFERENCES [dbo].[DCDistributionBox] ([ID])
    
    CREATE TABLE [dbo].[StringData](
        [DCStringID] [bigint] NOT NULL,
        [TimeStamp] [datetime] NOT NULL,
        [DCCurrent] [decimal](18, 2) NULL,
        CONSTRAINT [PrimaryKey4] PRIMARY KEY CLUSTERED ( [TimeStamp] DESC, [DCStringID] ASC)
    )
    

    The [StringData] table as has following storage stats:

    • Data space: 26,901.86 MB
    • Row count: 131,827,749
    • Partitioned: true
    • Partition count: 62

    Usage

    I now want to group the data in the [StringData] table and do some aggregation.

    In pure SQL it would look like this:

    declare @projectID bigint = 20686;
    declare @from datetime = '06.02.2016';
    declare @till datetime = '07.02.2016';
    declare @interval int = 15;
    
    SELECT 
    DATEADD(MINUTE, DATEDIFF(MINUTE, 0, [StringData].[TimeStamp] ) / @interval * @interval, 0) AS [TimeStamp]
    , AVG([StringData].[DCCurrent] / [DCString].[CurrentMPP]) AS [DCCurrentAvg]
    , MIN([StringData].[DCCurrent] / [DCString].[CurrentMPP]) AS [DCCurrentMin]
    , MAX([StringData].[DCCurrent] / [DCString].[CurrentMPP]) AS [DCCurrentMax]
    , STDEV([StringData].[DCCurrent] / [DCString].[CurrentMPP]) AS [DCCurrentStDev]
    , COUNT(*) AS [Count]
    
    FROM [StringData]
    JOIN [DCString] ON [DCString].[ID] = [StringData].[DCStringID]
    JOIN [DCDistributionBox] ON [DCDistributionBox].[ID] = [DCString].[DCDistributionBoxID]
    JOIN [DataLogger] ON [DataLogger].[ID] = [DCDistributionBox].[DataLoggerID]
    
    WHERE [DataLogger].[ProjectID] = @projectID
    AND [StringData].[TimeStamp] >= @from
    AND [StringData].[TimeStamp] < @till
    
    GROUP BY DATEADD(MINUTE, DATEDIFF(MINUTE, 0, [StringData].[TimeStamp] ) / @interval * @interval, 0)
    

    Excecution timespan: 653ms

    Now I created a Entity Framework equivalent (I thought):

    var compareData = model.StringDatas
        AsNoTracking()
        .Where(p => p.DCString.DCDistributionBox.DataLogger.ProjectID == projectID && p.TimeStamp >= from && p.TimeStamp < till)                    
        .Select(d => new
        {
            TimeStamp = d.Key,
            DCCurrentMin = d.Min(v => v.DCCurrent / v.DCString.CurrentMPP),
            DCCurrentMax = d.Max(v => v.DCCurrent / v.DCString.CurrentMPP),
            DCCurrentAvg = d.Average(v => v.DCCurrent / v.DCString.CurrentMPP),
            DCCurrentStDev = DbFunctions.StandardDeviation(d.Select(v => v.DCCurrent / v.DCString.CurrentMPP))
        })
        .ToList();
    

    And the result of excecution was a timeout (longer the 30 seconds)!?

    Attempts

    I now took a look into the Entity Framework generated SQL query and looks like this:

    SELECT 
    1 AS [C1], 
    [Project10].[C1] AS [C2], 
    [Project10].[C2] AS [C3], 
    [Project10].[C3] AS [C4], 
    [Project10].[C4] AS [C5], 
    [Project10].[C5] AS [C6]
    FROM ( SELECT 
        [Project8].[C1] AS [C1], 
        [Project8].[C2] AS [C2], 
        [Project8].[C3] AS [C3], 
        [Project8].[C4] AS [C4], 
        (SELECT 
            STDEV([Project9].[A1]) AS [A1]
            FROM ( SELECT 
                [Project9].[DCCurrent] / [Project9].[CurrentMPP] AS [A1]
                FROM ( SELECT 
                    [Extent17].[DCStringID] AS [DCStringID], 
                    [Extent17].[DCCurrent] AS [DCCurrent], 
                    [Extent18].[ID] AS [ID], 
                    [Extent18].[CurrentMPP] AS [CurrentMPP]
                    FROM    [dbo].[StringData] AS [Extent17]
                    INNER JOIN [dbo].[DCString] AS [Extent18] ON [Extent17].[DCStringID] = [Extent18].[ID]
                    INNER JOIN [dbo].[DCDistributionBox] AS [Extent19] ON [Extent18].[DCDistributionBoxID] = [Extent19].[ID]
                    INNER JOIN [dbo].[DataLogger] AS [Extent20] ON [Extent19].[DataLoggerID] = [Extent20].[ID]
                    WHERE (([Extent20].[ProjectID] = @p__linq__0) OR (([Extent20].[ProjectID] IS NULL) AND (@p__linq__0 IS NULL))) AND ([Extent17].[TimeStamp] >= @p__linq__1) AND ([Extent17].[TimeStamp] < @p__linq__2) AND (([Project8].[C1] = (DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Extent17].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3))) OR (([Project8].[C1] IS NULL) AND (DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Extent17].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) IS NULL)))
                )  AS [Project9]
            )  AS [Project9]) AS [C5]
        FROM ( SELECT 
            [Project6].[C1] AS [C1], 
            [Project6].[C2] AS [C2], 
            [Project6].[C3] AS [C3], 
            (SELECT 
                AVG([Project7].[A1]) AS [A1]
                FROM ( SELECT 
                    [Project7].[DCCurrent] / [Project7].[CurrentMPP] AS [A1]
                    FROM ( SELECT 
                        [Extent13].[DCStringID] AS [DCStringID], 
                        [Extent13].[DCCurrent] AS [DCCurrent], 
                        [Extent14].[ID] AS [ID], 
                        [Extent14].[CurrentMPP] AS [CurrentMPP]
                        FROM    [dbo].[StringData] AS [Extent13]
                        INNER JOIN [dbo].[DCString] AS [Extent14] ON [Extent13].[DCStringID] = [Extent14].[ID]
                        INNER JOIN [dbo].[DCDistributionBox] AS [Extent15] ON [Extent14].[DCDistributionBoxID] = [Extent15].[ID]
                        INNER JOIN [dbo].[DataLogger] AS [Extent16] ON [Extent15].[DataLoggerID] = [Extent16].[ID]
                        WHERE (([Extent16].[ProjectID] = @p__linq__0) OR (([Extent16].[ProjectID] IS NULL) AND (@p__linq__0 IS NULL))) AND ([Extent13].[TimeStamp] >= @p__linq__1) AND ([Extent13].[TimeStamp] < @p__linq__2) AND (([Project6].[C1] = (DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Extent13].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3))) OR (([Project6].[C1] IS NULL) AND (DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Extent13].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) IS NULL)))
                    )  AS [Project7]
                )  AS [Project7]) AS [C4]
            FROM ( SELECT 
                [Project4].[C1] AS [C1], 
                [Project4].[C2] AS [C2], 
                (SELECT 
                    MAX([Project5].[A1]) AS [A1]
                    FROM ( SELECT 
                        [Project5].[DCCurrent] / [Project5].[CurrentMPP] AS [A1]
                        FROM ( SELECT 
                            [Extent9].[DCStringID] AS [DCStringID], 
                            [Extent9].[DCCurrent] AS [DCCurrent], 
                            [Extent10].[ID] AS [ID], 
                            [Extent10].[CurrentMPP] AS [CurrentMPP]
                            FROM    [dbo].[StringData] AS [Extent9]
                            INNER JOIN [dbo].[DCString] AS [Extent10] ON [Extent9].[DCStringID] = [Extent10].[ID]
                            INNER JOIN [dbo].[DCDistributionBox] AS [Extent11] ON [Extent10].[DCDistributionBoxID] = [Extent11].[ID]
                            INNER JOIN [dbo].[DataLogger] AS [Extent12] ON [Extent11].[DataLoggerID] = [Extent12].[ID]
                            WHERE (([Extent12].[ProjectID] = @p__linq__0) OR (([Extent12].[ProjectID] IS NULL) AND (@p__linq__0 IS NULL))) AND ([Extent9].[TimeStamp] >= @p__linq__1) AND ([Extent9].[TimeStamp] < @p__linq__2) AND (([Project4].[C1] = (DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Extent9].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3))) OR (([Project4].[C1] IS NULL) AND (DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Extent9].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) IS NULL)))
                        )  AS [Project5]
                    )  AS [Project5]) AS [C3]
                FROM ( SELECT 
                    [Project2].[C1] AS [C1], 
                    (SELECT 
                        MIN([Project3].[A1]) AS [A1]
                        FROM ( SELECT 
                            [Project3].[DCCurrent] / [Project3].[CurrentMPP] AS [A1]
                            FROM ( SELECT 
                                [Extent5].[DCStringID] AS [DCStringID], 
                                [Extent5].[DCCurrent] AS [DCCurrent], 
                                [Extent6].[ID] AS [ID], 
                                [Extent6].[CurrentMPP] AS [CurrentMPP]
                                FROM    [dbo].[StringData] AS [Extent5]
                                INNER JOIN [dbo].[DCString] AS [Extent6] ON [Extent5].[DCStringID] = [Extent6].[ID]
                                INNER JOIN [dbo].[DCDistributionBox] AS [Extent7] ON [Extent6].[DCDistributionBoxID] = [Extent7].[ID]
                                INNER JOIN [dbo].[DataLogger] AS [Extent8] ON [Extent7].[DataLoggerID] = [Extent8].[ID]
                                WHERE (([Extent8].[ProjectID] = @p__linq__0) OR (([Extent8].[ProjectID] IS NULL) AND (@p__linq__0 IS NULL))) AND ([Extent5].[TimeStamp] >= @p__linq__1) AND ([Extent5].[TimeStamp] < @p__linq__2) AND (([Project2].[C1] = (DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Extent5].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3))) OR (([Project2].[C1] IS NULL) AND (DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Extent5].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) IS NULL)))
                            )  AS [Project3]
                        )  AS [Project3]) AS [C2]
                    FROM ( SELECT 
                        [Distinct1].[C1] AS [C1]
                        FROM ( SELECT DISTINCT 
                            DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Extent1].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) AS [C1]
                            FROM    [dbo].[StringData] AS [Extent1]
                            INNER JOIN [dbo].[DCString] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]
                            INNER JOIN [dbo].[DCDistributionBox] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]
                            INNER JOIN [dbo].[DataLogger] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]
                            WHERE (([Extent4].[ProjectID] = @p__linq__0) OR (([Extent4].[ProjectID] IS NULL) AND (@p__linq__0 IS NULL))) AND ([Extent1].[TimeStamp] >= @p__linq__1) AND ([Extent1].[TimeStamp] < @p__linq__2)
                        )  AS [Distinct1]
                    )  AS [Project2]
                )  AS [Project4]
            )  AS [Project6]
        )  AS [Project8]
    )  AS [Project10]
    

    Question

    Why does Entity Framework seperates every aggregation into a sngle subselect and how can I avoid this to get a performance near to the raw SQL query?

    Update 1

    This has exact the same SQL query output and timeout result:

    var query = from d in model.StringDatas
            where d.DCString.DCDistributionBox.DataLogger.ProjectID == projectID
            where d.TimeStamp >= fromDate
            where d.TimeStamp < tillDate
            group d by DbFunctions.AddMinutes(DateTime.MinValue, DbFunctions.DiffMinutes(DateTime.MinValue, d.TimeStamp) / minuteInterval * minuteInterval) into g
            select new
            {
                TimeStamp = g.Key,
                DCCurrentMin = g.Min(v => v.DCCurrent / v.DCString.CurrentMPP),
                DCCurrentMax = g.Max(v => v.DCCurrent / v.DCString.CurrentMPP),
                DCCurrentAvg = g.Average(v => v.DCCurrent / v.DCString.CurrentMPP),
                DCCurrentStDev = DbFunctions.StandardDeviation(g.Select(v => v.DCCurrent / v.DCString.CurrentMPP))
            };
    
    var queryResult= query.ToList();
    

    解决方案

    I've noticed that behavior (bug?) when answering How do I get EF6 to generate efficient SQL containing mulitple aggregate columns?.

    The only way I was able to resolve it was by introducing a temporary projection before the group by operation, and the same applies to your case:

    var query =
        from e in (from d in db.StringDatas.AsNoTracking()
                   where d.DCString.DCDistributionBox.DataLogger.ProjectID == projectID
                       && d.TimeStamp >= fromDate && d.TimeStamp < tillDate
                   select new { d, s = d.DCString })
        group e by DbFunctions.AddMinutes(DateTime.MinValue, DbFunctions.DiffMinutes(DateTime.MinValue, e.d.TimeStamp) / minuteInterval * minuteInterval) into g
        let ratio = g.Select(e => e.d.DCCurrent / e.s.CurrentMPP)
        select new
        {
            TimeStamp = g.Key,
            DCCurrentMin = ratio.Min(),
            DCCurrentMax = ratio.Max(),
            DCCurrentAvg = ratio.Average(),
            DCCurrentStDev = DbFunctions.StandardDeviation(ratio)
        };
    

    EF generated SQL:

    SELECT 
        1 AS [C1], 
        [GroupBy1].[K1] AS [C2], 
        [GroupBy1].[A1] AS [C3], 
        [GroupBy1].[A2] AS [C4], 
        [GroupBy1].[A3] AS [C5], 
        [GroupBy1].[A4] AS [C6]
        FROM ( SELECT 
            [Project1].[K1] AS [K1], 
            MIN([Project1].[A1]) AS [A1], 
            MAX([Project1].[A2]) AS [A2], 
            AVG([Project1].[A3]) AS [A3], 
            STDEV([Project1].[A4]) AS [A4]
            FROM ( SELECT 
                DATEADD (minute, ((DATEDIFF (minute, @p__linq__4, [Project1].[TimeStamp])) / @p__linq__5) * @p__linq__6, @p__linq__3) AS [K1], 
                [Project1].[DCCurrent] / [Project1].[CurrentMPP] AS [A1], 
                [Project1].[DCCurrent] / [Project1].[CurrentMPP] AS [A2], 
                [Project1].[DCCurrent] / [Project1].[CurrentMPP] AS [A3], 
                [Project1].[DCCurrent] / [Project1].[CurrentMPP] AS [A4]
                FROM ( SELECT 
                    [Extent1].[TimeStamp] AS [TimeStamp], 
                    [Extent1].[DCCurrent] AS [DCCurrent], 
                    [Extent2].[CurrentMPP] AS [CurrentMPP]
                    FROM    [dbo].[StringDatas] AS [Extent1]
                    INNER JOIN [dbo].[DCStrings] AS [Extent2] ON [Extent1].[DCStringID] = [Extent2].[ID]
                    INNER JOIN [dbo].[DCDistributionBoxes] AS [Extent3] ON [Extent2].[DCDistributionBoxID] = [Extent3].[ID]
                    INNER JOIN [dbo].[DataLoggers] AS [Extent4] ON [Extent3].[DataLoggerID] = [Extent4].[ID]
                    WHERE ([Extent4].[ProjectID] = @p__linq__0) AND ([Extent1].[TimeStamp] >= @p__linq__1) AND ([Extent1].[TimeStamp] < @p__linq__2)
                )  AS [Project1]
            )  AS [Project1]
            GROUP BY [K1]
        )  AS [GroupBy1]
    

    这篇关于实体框架多重聚合性能的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆