Random select is not always returning a single row


Problem description

The intention of the following (simplified) code fragment is to return one random row. Unfortunately, when we run this fragment in the query analyzer, it returns between zero and three results.

Our input table consists of exactly 5 rows with unique IDs, and we select from it where ID equals a random number, so we are stumped that more than one row could ever be returned.

Note: among other things, we have already tried casting the CHECKSUM result to an integer, to no avail.

DECLARE @Table TABLE (
  ID INTEGER IDENTITY (1, 1)
  , FK1 INTEGER
)

INSERT INTO @Table
SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5

SELECT  *
FROM    @Table 
WHERE   ID = ABS(CHECKSUM(NEWID())) % 5 + 1

Edit

Our usage scenario is as follows (please don't comment on whether it is the right thing to do or not; the powers that be have decided).

Ultimately, we must create a result with realistic values in which the combination of producer and weights is obfuscated by selecting existing weights at random from the table itself.
The query would then become something like this (which is also why RAND cannot be used):

SELECT  t.ID
        , FK1 = (SELECT FK1 FROM @Table WHERE ID=ABS(CHECKSUM(NEWID())) % 5 + 1)
FROM    @Table t

Because the inner select can return zero results, it would yield a NULL value, which again is not acceptable. It is from the investigation of why the inner select returns between zero and x results that this question sprouted (is this even English?).

What turned the light on for me was the simple observation that ABS(CHECKSUM(NEWID())) % 5 + 1 is re-evaluated for each row. I was under the impression that ABS(CHECKSUM(NEWID())) % 5 + 1 would be evaluated once and then matched against every row.

Thank you all for answering and for slowly but surely leading me to a better understanding.

Recommended answer

This happens because NEWID() gives a different value for each row in the table, so the WHERE expression is evaluated separately for every row. Each row, independently of the others, has a one in five chance of being returned. Consequently, as it stands, you actually have a 1 in 3125 chance of all 5 rows being returned!
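The one-in-five argument can be worked out in full. As a sketch, assuming each row's NEWID() draw is independent and that ABS(CHECKSUM(NEWID())) % 5 + 1 is (approximately) uniform over 1 to 5, the number of rows returned, called K here for illustration, follows a binomial distribution:

```latex
\begin{align*}
K &\sim \mathrm{Binomial}\left(5, \tfrac{1}{5}\right), &
P(K = k) &= \binom{5}{k} \left(\tfrac{1}{5}\right)^{k} \left(\tfrac{4}{5}\right)^{5-k} \\
P(K = 0) &= \left(\tfrac{4}{5}\right)^{5} = \tfrac{1024}{3125} \approx 0.328 \\
P(K = 1) &= 5 \cdot \tfrac{1}{5} \cdot \left(\tfrac{4}{5}\right)^{4} = \tfrac{1280}{3125} \approx 0.410 \\
P(K = 5) &= \left(\tfrac{1}{5}\right)^{5} = \tfrac{1}{3125}
\end{align*}
```

Under these assumptions the original query returns exactly one row only about 41% of the time and no rows about a third of the time, which is consistent with the zero to three results reported in the question; four or five rows are possible but rare.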

To see the per-row evaluation for yourself, run the following query. You'll see that each row gets a different NEWID() value.

SELECT  * , NEWID()
FROM    @Table  

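This will fix your code, because the random ID is computed once and stored in a variable before the select runs: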

DECLARE @Id int
SET @Id = ABS(CHECKSUM(NEWID())) % 5 + 1

SELECT  * 
FROM    @Table  
WHERE   ID = @Id

However, I'm not sure this is the most efficient method of selecting a single random row from the table.

You might find this MSDN article useful: http://msdn.microsoft.com/en-us/library/Aa175776 (Random Sampling in T-SQL)

EDIT 1: now that I think about it, this probably is the most efficient way to do it, assuming the number of rows remains fixed and the IDs are guaranteed to be contiguous.
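If those assumptions do not hold (for example, rows get deleted and leave gaps in ID, or the row count changes), a common alternative is to sort by NEWID() and keep the first row. The following is a minimal sketch that reuses the question's table definition; the DELETE is only there to simulate a gap, and because the approach sorts the whole table it is probably best suited to small tables:

```sql
DECLARE @Table TABLE (
  ID INTEGER IDENTITY (1, 1)
  , FK1 INTEGER
)

INSERT INTO @Table (FK1)
SELECT 1
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5

-- Simulate a gap in the ID sequence (illustration only)
DELETE FROM @Table WHERE ID = 3

-- NEWID() is generated per row and used as the sort key;
-- TOP 1 then keeps exactly one of the remaining rows
SELECT  TOP 1 *
FROM    @Table
ORDER BY NEWID()
```

Unlike the modulo approach, this does not hard-code the row count or depend on the ID values, and it returns exactly one row whenever the table is non-empty.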

EDIT 2: to achieve the desired result when used as a sub-query, use TOP 1 like this:

SELECT  t.ID 
        , FK1 = (SELECT TOP 1 FK1 FROM @Table ORDER BY NEWID()) 
FROM    @Table t
