Efficiently select random rows from a large result set with LINQ (a la TABLESAMPLE)
Question
I want to select a handful of random rows from the results of a complex query on a very large table (many millions of rows).
I am using SQL Server 2008, and the proper way to do this efficiently seems to be the TABLESAMPLE clause.
Note 1: I am not interested in the popular "order by NEWID()" solution - it is inefficient for large tables.
Note 2: Since my query is complex, I do not want to have to first calculate the COUNT over it, if possible.
Note 3: Since the resultset is huge, I do not want to have to traverse it myself, such as is suggested here.
The kicker is that I am using LINQ. Specifically, LINQ-To-Entities.
Is there a LINQ-friendly way to use TABLESAMPLE?
Even if there is no direct support, is there some way I can write most of my query in LINQ and then do a small amount of manual SQL to perform the TABLESAMPLE?
Recommended answer
It seems that what I want to accomplish is not even possible in the first place.
TABLESAMPLE cannot be used on derived tables, so it is not even feasible to have a complex query generating a large result set and then get a random sampling with TABLESAMPLE.
TABLESAMPLE can only be applied to the base tables that go into a query, before joins and so forth. (see documentation)
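To make the restriction concrete, here is a minimal T-SQL sketch. The tables and columns (dbo.Orders, dbo.Customers, OrderID, CustomerID, CustomerName, Total) are hypothetical names chosen for illustration; they do not come from the original question.

```sql
-- Hypothetical schema: dbo.Orders and dbo.Customers are illustrative names only.

-- Allowed: TABLESAMPLE is attached to a base table in the FROM clause,
-- so the sampling happens before the join is evaluated.
SELECT o.OrderID, c.CustomerName
FROM dbo.Orders AS o TABLESAMPLE (1 PERCENT)
JOIN dbo.Customers AS c
    ON c.CustomerID = o.CustomerID;

-- Not allowed: sampling the output of a derived table.
-- SQL Server rejects a statement of this shape, so it is left commented out.
-- SELECT d.OrderID
-- FROM (SELECT o.OrderID FROM dbo.Orders AS o WHERE o.Total > 100) AS d
--      TABLESAMPLE (1 PERCENT);
```

Note that TABLESAMPLE samples data pages rather than individual rows, so the number of rows returned is only approximate, and any filter in the rest of the query applies to the rows that survive the sampling.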
This MSDN link describes a way to get a random percentage of results efficiently, so the best way to do approximately what I want may be to use that in a view, and build my LINQ off of that view.
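A rough sketch of what such a view might look like, using the per-row random filter that the TABLESAMPLE documentation describes as an alternative. The view name, table, and columns (dbo.OrdersSample, dbo.Orders, OrderID, CustomerID, Total) and the 1 percent threshold are assumptions for illustration.

```sql
-- Hypothetical view over a hypothetical table; adjust names and the 0.01 threshold as needed.
CREATE VIEW dbo.OrdersSample
AS
SELECT o.OrderID, o.CustomerID, o.Total
FROM dbo.Orders AS o
-- CHECKSUM(NEWID(), o.OrderID) gives a pseudo-random int per row;
-- masking with 0x7fffffff drops the sign bit, and dividing by the same
-- constant scales the value into [0, 1]. Rows at or below 0.01 are kept,
-- which is roughly 1 percent of the table.
WHERE 0.01 >= CAST(CHECKSUM(NEWID(), o.OrderID) & 0x7fffffff AS float)
              / CAST(0x7fffffff AS int);
```

If the view is mapped in the entity model, the remaining filters and projections could then be composed against it in LINQ. This still reads every row of the underlying table, but it avoids the full sort that ORDER BY NEWID() requires, and each execution returns a different sample.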
Thanks, everyone, for your input.