Efficient query for only the first N rows for each unique ID
Question
This is a follow-up to this question.
TLDR:
Question:
I want to filter a query to only keep the first n rows for each unique ID.
Answer:
query = query.GroupBy(q => q.ID).SelectMany(g => g.Take(n));
The problem with this answer is that for 80,000+ rows, evaluating the query takes much longer (at least twice as long) than filtering by iteration (foreach). Looking at the SQL generated by this answer, a CROSS APPLY is used, most likely for the SelectMany().
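The question does not show the foreach loop it compares against. As a rough sketch of what "filtering by iteration" could look like, assuming the rows are already loaded into memory in the desired order, a per-ID counter is enough. The Row type and the FirstNPerId name are hypothetical and not taken from the original post.

```csharp
using System.Collections.Generic;

// Hypothetical row shape; the original post only implies ID, PK and Timestamp columns.
public sealed record Row(int PK, int ID, long Timestamp);

public static class IterativeFilter
{
    // Keeps the first n rows seen for each ID, preserving the input order.
    public static List<Row> FirstNPerId(IEnumerable<Row> rows, int n)
    {
        var keptPerId = new Dictionary<int, int>(); // ID -> number of rows kept so far
        var kept = new List<Row>();

        foreach (var row in rows)
        {
            // count stays 0 when the ID has not been seen yet
            keptPerId.TryGetValue(row.ID, out var count);
            if (count < n)
            {
                kept.Add(row);
                keptPerId[row.ID] = count + 1;
            }
        }

        return kept;
    }
}
```

This makes a single pass over the rows, but it only works after every row has been fetched from the database, which is why a query-side filter is still attractive.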
This link describes what CROSS APPLY does:
The APPLY operator allows you to join two table expressions; the right table expression is processed every time for each row from the left table expression.
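The per-row evaluation described in that quote can be imitated in memory with SelectMany, which may help explain why the two are associated in the generated SQL. The following is only an illustration of the semantics on plain collections, not what the LINQ provider actually emits; ApplySketch and CrossApply are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ApplySketch
{
    // For every left value, evaluate the right-hand expression and pair up the results.
    // Left values whose right-hand expression is empty produce no output rows,
    // which matches CROSS APPLY (as opposed to OUTER APPLY).
    public static List<(int Left, int Right)> CrossApply(
        IEnumerable<int> left,
        Func<int, IEnumerable<int>> right) =>
        left.SelectMany(l => right(l), (l, r) => (Left: l, Right: r)).ToList();
}
```

Because the right-hand side runs once per left row, the cost grows with the number of left rows, which is consistent with the slowdown reported in the question.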
In short, I'm looking for a filtering query which efficiently gathers the top N rows for each unique ID.
A Linq solution with explained SQL would be ideal.
Recommended answer
I found my answer in SQL here (SQL 2000 Solution at the bottom) and managed to implement a Queryable/Linq version:
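// Keep row a only if its PK is among the N newest rows that share its ID.
query = tableQueryable.Where(a =>
        tableQueryable.Where(b => b.ID == a.ID)        // rows with the same ID as a
            .OrderByDescending(o => o.Timestamp)       // newest first
            .Take(N)                                   // top N for that ID
            .Select(s => s.PK)
            .Contains(a.PK)
    ).OrderByDescending(d => d.Timestamp);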
A fairly standard "sub-query" pattern. It's much faster on a large table.
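To check that the sub-query pattern selects the same rows as the GroupBy/SelectMany version, both can be run against a small in-memory list. This is a minimal sketch under stated assumptions: the Row type and method names are hypothetical, timestamps are distinct within each ID, and an OrderByDescending is added inside each group so both versions agree on which rows count as "first". It says nothing about speed, since the performance difference reported above comes from the SQL each query translates to, not from LINQ to Objects.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row shape matching the columns used in the answer's query.
public sealed record Row(int PK, int ID, long Timestamp);

public static class TopNPerId
{
    // Question's approach: group by ID, then take n rows from each group.
    public static List<int> ViaGroupBy(IReadOnlyList<Row> rows, int n) =>
        rows.GroupBy(r => r.ID)
            .SelectMany(g => g.OrderByDescending(r => r.Timestamp).Take(n))
            .OrderByDescending(r => r.Timestamp)
            .Select(r => r.PK)
            .ToList();

    // Answer's approach: keep a row when its PK is among the n newest PKs for its ID.
    public static List<int> ViaSubquery(IReadOnlyList<Row> rows, int n) =>
        rows.Where(a => rows.Where(b => b.ID == a.ID)
                            .OrderByDescending(b => b.Timestamp)
                            .Take(n)
                            .Select(b => b.PK)
                            .Contains(a.PK))
            .OrderByDescending(a => a.Timestamp)
            .Select(a => a.PK)
            .ToList();
}
```

If two rows with the same ID share a timestamp, the tie is broken by whatever order the source returns, so the two versions are only guaranteed to match when timestamps are unique per ID.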