Efficiently select random rows from large resultset with LINQ (ala TABLESAMPLE)


Question

I want to select a handful of random rows from the results of a complex query on a very large table (many millions of rows).

I am using SQL Server 2008, and the proper way to do this efficiently seems to be the TABLESAMPLE clause.

Note 1: I am not interested in the popular "order by NEWID()" solution; it is inefficient for large tables.

Note 2: Since my query is complex, I would rather not have to calculate the COUNT over it first, if possible.

Note 3: Since the result set is huge, I do not want to traverse it myself, as is suggested here.

The kicker is that I am using LINQ. Specifically, LINQ-To-Entities.

Is there a LINQ-friendly way to use TABLESAMPLE?

Even if there is no direct support, is there some way I can write most of my query in LINQ and then do a small amount of manual SQL to perform the TABLESAMPLE?

Answer

It seems that what I want to accomplish is not possible in the first place.

TABLESAMPLE cannot be used on derived tables, so it is not feasible to have a complex query generate a large result set and then take a random sample of it with TABLESAMPLE.

TABLESAMPLE can only be applied to the base tables that go into a query, before joins and so forth. (see documentation)
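To make the restriction concrete, here is a minimal T-SQL sketch of where the clause is allowed to go. The tables `dbo.Orders` and `dbo.Customers` and their columns are hypothetical, invented only for illustration. Note that TABLESAMPLE samples data pages rather than individual rows, so the percentage is approximate.

```sql
-- Hypothetical tables dbo.Orders and dbo.Customers, for illustration only.
-- TABLESAMPLE is attached to a base table in the FROM clause, so the
-- sample is taken from dbo.Orders itself, not from the joined and
-- filtered result of the whole query.
SELECT o.OrderId, c.Name
FROM dbo.Orders AS o TABLESAMPLE (1 PERCENT)
JOIN dbo.Customers AS c
    ON c.CustomerId = o.CustomerId
WHERE o.Total > 100;
```

Because the sample is drawn before the join and the WHERE filter, the final row count can be much smaller than 1 percent of the full query result, or even zero.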

This MSDN link describes a way to get a random percentage of results efficiently, so the best way to do approximately what I want may be to use that technique in a view and build my LINQ query on top of that view.
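As a rough sketch of the view idea, and not necessarily the exact expression from the linked article, one common pattern is a per-row random filter built on `CHECKSUM(NEWID())`. The table `dbo.Orders` and the view name `dbo.OrdersSample` are hypothetical, and the 1 percent rate is an arbitrary choice.

```sql
-- Hypothetical view over a hypothetical table dbo.Orders.
-- NEWID() is evaluated once per row, so each row gets its own
-- pseudo-random integer from CHECKSUM; keeping rows whose value
-- modulo 100 is 0 retains roughly 1 percent of them.
-- The cast to bigint avoids an overflow in ABS for the smallest int.
CREATE VIEW dbo.OrdersSample
AS
SELECT o.*
FROM dbo.Orders AS o
WHERE ABS(CAST(CHECKSUM(NEWID()) AS bigint)) % 100 < 1;
```

Once such a view is mapped in the entity model, the remaining filters and projections can be written in LINQ against it. This still scans the base table, but it avoids the sort that "order by NEWID()" requires, and each query against the view returns a different sample.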

Thanks, everyone, for your input.
