通过分页优化SQL查询 [英] Optimize SQL query with pagination
问题描述
我有一个针对SQL Server数据库的查询,该查询需要10秒钟以上的时间才能执行.正在查询的表有超过1400万行.
I have a query running against a SQL Server database that is taking over 10 seconds to execute. The table being queried has over 14 million rows.
我想按日期顺序按给定的 ServiceUserId
在 Notes
表中显示 Text
列.可能有成千上万的条目,所以我想将返回值限制在可管理的水平.
I want to display the Text
column from a Notes
table by a given ServiceUserId
in date order. There could be thousands of entries so I want to limit the returned values to a manageable level.
SELECT Text
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY [DateDone]) AS RowNum, Text
FROM
Notes
WHERE
ServiceUserId = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2') AS RowConstrainedResult
WHERE
RowNum >= 40 AND RowNum < 60
ORDER BY
RowNum
以下是上述查询的执行计划.
Below is the execution plan for the above query.
- 非聚集索引-
ServiceUserId
和DateDone
列上的非聚集索引按升序排列. - 键查找-表的主键是
NoteId
- Nonclustered Index - nonclustered index on the
ServiceUserId
andDateDone
columns in ascending order. - Key lookup - Primary key for the table which is the
NoteId
如果第二次运行相同的查询但行号不同,那么我会以毫秒为单位得到响应,这是基于缓存的执行计划.但是,针对其他 ServiceUserId
运行的查询将需要约10秒的时间.
If I run the same query a second time but with different row numbers then I get a response in milliseconds, I assume from a cached execution plan. The query ran for a different ServiceUserId
will take ~10 seconds though.
关于如何加快此查询的任何建议?
Any suggestions for how to speed up this query?
推荐答案
您应该看一下进入键集分页.
其性能远胜于行集分页.
与之根本不同,它不是引用特定的行编号块,而是引用起始点来查找索引键.
It differs fundamentally from it, in that instead of referencing a particular block of row numbers, instead you reference starting point to lookup the index key.
之所以要快得多,是因为您不必关心特定键之前有多少行,而只是寻找一个键并向前(或向后)移动.
The reason it is much faster is that you don't care about how many rows are before a particular key, you just seek a key and move forward (or backward).
假设您要通过单个 ServiceUserId
进行过滤,并按 DateDone
进行排序.您需要如下所示的索引(如果太大,可以省去 INCLUDE
,它对数学的影响不大):
Say you are filtering by a single ServiceUserId
, ordering by DateDone
. You need an index as follows (you could leave out the INCLUDE
if it's too big, it doesn't change the maths very much):
create index IX_DateDone on Notes (ServiceUserId, DateDone) INCLUDE (TEXT);
现在,当您选择一些行时,不必提供开始和结束的行编号,而是提供开始键:
Now, when you select some rows, instead of giving the start and end row numbers, give the starting key:
SELECT TOP (20)
Text,
DateDone
FROM
Notes
WHERE
ServiceUserId = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2'
AND DateDone > @startingDate
ORDER BY
DateDone;
在下一次运行中,您传递您收到的最后一个 DateDone
值.这样可以得到下一批.
On the next run, you pass the last DateDone
value you received. This gets you the next batch.
一个小的缺点是您不能跳转页面.但是,对于用户来说,跳转到第327页的情况比从用户界面角度考虑的情况要罕见得多.并不重要.
The one small downside is that you cannot jump pages. However, it is much rarer than some may think (from a UI perspective) for a user to want to jump to page 327. So that doesn't really matter.
密钥必须是唯一的.如果密钥不是唯一的,则无法精确找到下一行.如果您需要使用额外的列来保证唯一性,它将变得更加复杂:
The key must be unique. If it is not unique you can't seek to exactly the next row. If you need to use an extra column to guarantee uniqueness, it gets a little more complicated:
WITH NotesFiltered AS
(
SELECT * FROM Notes
WHERE
ServiceUserId = '6D33B91A-1C1D-4C99-998A-4A6B0CC0A6C2'
)
SELECT TOP (20)
Text,
DateDone
FROM (
SELECT
Text,
DateDone,
0 AS ordering
FROM NotesFiltered
WHERE
DateDone = @startingDate AND NoteId > @startingNoteId
UNION ALL
SELECT
Text,
DateDone,
1 AS ordering
FROM NotesFiltered
WHERE
DateDone > @startingDate
) n
ORDER BY
ordering, DateDone, NoteId;
旁注
在支持行值比较的RDBMS中,可以通过编写以下代码将多列示例简化回原始代码:
In RDBMSs that support row-value comparisons, the multi-column example could be simplified back to the original code by writing:
WHERE (DateDone, NoteId) > (@startingDate, @startingNoteId)
很遗憾,SQL Server当前不支持此功能.
请投票给天青对此的反馈请求
Unfortunately SQL Server does not support this currently.
Please vote for the Azure Feedback request for this
这篇关于通过分页优化SQL查询的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!