EXTREMELY Poor LINQ Query Performance When Using Skip/Take for Paging

Question

I need to query records from a DB2 database using LINQ. I have entities generated from the DB schema and am attempting to run a LINQ query using Skip and Take. The underlying table has about 25 columns and roughly a million records. When I execute the query without Skip(), it takes approximately .508 milliseconds to complete. When I include Skip(), it takes close to 30 seconds. Big difference.

Can anyone tell me why this is happening?

UPDATE: Here is the LINQ query I am using.

var x = 30;

var results = context.ASSET_T
.OrderBy(c => c.ASSET_ID)
.Skip(x)
.Take(x)
.ToList();

UPDATE: So I just tried updating the query so that I only return a single column, ASSET_ID. When I only return that one column the query WITH the Skip() only takes .256 milliseconds.

var x = 30;

var results = context.ASSET_T
.OrderBy(c => c.ASSET_ID)
.Skip(x)
.Take(x)
.Select(c => c.ASSET_ID)
.ToList();

If I include any additional columns then the query execution time increases DRAMATICALLY.

The query below for example takes 10 seconds to execute.

var x = 30;

var results = context.ASSET_T
.OrderBy(c => c.ASSET_ID)
.Skip(x)
.Take(x)
.Select(c => new {
                 ASSET_ID = c.ASSET_ID,
                 ASSET_TYP = c.ASSET_TYP,
                 ASSET_DESC = c.ASSET_DESC
                 })
.ToList();

UPDATE: I have now discovered that there are issues (perhaps index-related) with the columns in the table I am trying to query. As I mentioned above, when I execute a query that only returns the ASSET_ID column it only takes .256 milliseconds. If I try to execute a query that ONLY returns ASSET_DESC or a query that ONLY returns ASSET_TYP then the query execution time jumps to around 9 seconds.

Would this indicate that those other columns are not currently being indexed?

UPDATE: I have added the SQL output from the above LINQ query.

SELECT 
Project1.C1 AS C1, 
Project1.ASSET_ID AS ASSET_ID, 
Project1.ASSET_TYP AS ASSET_TYP, 
Project1.ASSET_DESC AS ASSET_DESC
FROM ( SELECT Project1.ASSET_ID AS ASSET_ID, Project1.ASSET_TYP AS ASSET_TYP, Project1.ASSET_DESC AS ASSET_DESC, Project1.C1 AS C1, row_number() OVER (ORDER BY Project1.ASSET_ID ASC, Project1.ASSET_TYP ASC, Project1.ASSET_DESC ASC) AS row_number
  FROM ( SELECT 
    Extent1.ASSET_ID AS ASSET_ID, 
    Extent1.ASSET_TYP AS ASSET_TYP, 
    Extent1.ASSET_DESC AS ASSET_DESC, 
    CAST(1 AS int) AS C1
    FROM MYDB.ASSET_T AS Extent1
  )  AS Project1
)  AS Project1
WHERE Project1.row_number > 1
ORDER BY Project1.ASSET_ID ASC, Project1.ASSET_TYP ASC, Project1.ASSET_DESC ASC FETCH  FIRST 31 ROWS ONLY 

Recommended answer

Have you looked at the SQL that gets generated for this query?

As far as I know, Skip()/Take() eventually results in a generated statement that uses a function called Row_Number(). As shown below, this function is evaluated across the entire record set: the row number is inserted as the first generated column of the result, and only then are the rows between the start and end values you want selected. On large record sets this typically makes it very, very slow.

SELECT ...
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY [t0].[...]) AS [ROW_NUMBER], ...
    FROM [table] AS [t0]
    ) AS [t1]
WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1
ORDER BY [t1].[ROW_NUMBER]

If you can use an indexed numeric column and arrange it so that you read >= start_value AND <= end_value yourself, then move those values up by your paging amount for each page, the query will use the index and return results in milliseconds.
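A minimal sketch of that idea, written as a C# 9 top-level program. It uses a hypothetical in-memory list in place of ASSET_T so it runs without a database, and the names `assets` and `NextPage` are illustrative, not from the question. It also shows the "last seen key" variant of the range read (filter on the key, then Take) rather than fixed start/end values, so gaps in the IDs do not produce short pages. Whether a given LINQ provider translates this into an index-friendly range predicate on DB2 is an assumption to check against the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the ASSET_T table (IDs 1..1000).
// With a real context, the same Where/OrderBy/Take chain would be applied
// to context.ASSET_T instead of this list.
var assets = Enumerable.Range(1, 1000)
    .Select(i => (AssetId: i, AssetTyp: "TYP" + (i % 5), AssetDesc: "Asset " + i))
    .ToList();

// Range-based paging: instead of Skip(n), filter on the last key already
// returned. On a database this becomes a predicate on the key column, which
// an index on that column can satisfy without numbering every row.
List<(int AssetId, string AssetTyp, string AssetDesc)> NextPage(int lastSeenId, int pageSize) =>
    assets
        .Where(a => a.AssetId > lastSeenId)
        .OrderBy(a => a.AssetId)
        .Take(pageSize)
        .ToList();

var firstPage = NextPage(lastSeenId: 0, pageSize: 30);
// The caller remembers the last key of the previous page and passes it back in.
var secondPage = NextPage(lastSeenId: firstPage[^1].AssetId, pageSize: 30);

Console.WriteLine($"{secondPage[0].AssetId}..{secondPage[^1].AssetId}"); // prints 31..60
```

The trade-off is the one the answer goes on to mention: the caller has to carry the last key from page to page, so jumping straight to an arbitrary page number is no longer a one-liner.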

I have well-indexed databases with hundreds of millions of records, and Skip().Take() can take up to 30 minutes to obtain 25 records, whereas the direct read takes around 20-40 ms.

It would mean you would have to think about the way you code to achieve paging, and it may not be practicable to implement in your case.
