什么情况导致CHECKSUM_AGG返回0? [英] What conditions cause CHECKSUM_AGG to return 0?

查看:109
本文介绍了什么情况导致CHECKSUM_AGG返回0?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

似乎有很多情况导致 CHECKSUM_AGG 返回0,这是我不期望的。我只能找到一个讨论过的问题,那就是重复的值会导致问题。可以通过 DISTINCT GROUP BY 解决。



<我还发现了另外两种情况,这些情况对我来说意义不大。一个是由我的老板提供的,另一个是我找到的 MSDN 。这些我不知道如何解释。以下是一些演示场景的SQL:

  SELECT 
CHECKSUM_AGG(T.Number)AS ChecksumAgggregate



(2)
,(3)
,(4)
,(5)
)数字)

声明@t表

类别VARCHAR(15),
值VARCHAR(10)


插入@t

类别
,值


('OneCharacter','a')
,(' OneCharacter','b')
,(''OneCharacter','c')
,('OneCharacter','d')

,('TwoCharacters',' aa')
,('TwoCharacters','bb')
,('TwoCharacters','cc')
,('TwoCharacters','dd')

,('ThreeCharacters','aaa')
,('ThreeThreeCharacters','bbb')
,('ThreeCharacters','ccc')
,('ThreeCharacters' ,'ddd')

,('SixCharacters','aaaaaa')
,('SixCharacters','bbbbbb')
,('SixCharacters','cccccc' )
,('SixCharacte rs','dddddd')

,('AllValues','a')
,('AllValues','b')
,('AllValues',' c')
,('AllValues','d')
,('AllValues','aa')
,('AllValues','bb')
, ('AllValues','cc')
,('AllValues','dd')
,('AllValues','aaa')
,('AllValues','bbb' )
,('AllValues','ccc')
,('AllValues','ddd')
,('AllValues','aaaaaa')
,(' AllValues','bbbbbb')
,('AllValues','cccccc')
,('AllValues','dddddd')

选择
类别,来自@t
的CHECKSUM_AGG(CHECKSUM(Value))
按类别

选择类别,值,CHECKSUM(Value)ValueChecksum
来自@t

在这些示例中,这些查询中的 CHECKSUM_AGG 只是0 。最后一个查询显示,正在向 CHECKSUM_AGG 调用输入的 CHECKSUM 值没有重复。



我希望无论答案如何描述导致 CHECKSUM_AGG 返回0的原因,也会解释这些情况。

解决方案

例如,如果行数为偶数,或者这些值的总和等于某些值(例如14),则会发生这种情况。它们都产生0:

  SELECT CHECKSUM_AGG(N)
FROM(VALUES(1),(2),( 3),(4),(4))AS T(N);

选择CHECKSUM_AGG(N)
FROM(VALUES((2),(3),(4),(5))AS T(N);

但这些不包括:

  SELECT CHECKSUM_AGG(N)
FROM(VALUES(1),(2),(3),(4),(5))AS T(N);

选择CHECKSUM_AGG(N)
FROM(VALUES((2),(3),(4),(6))AS T(N);

14可能只是一个巧合(对于6行中的那个值来说并没有发生),我之所以提到它,只是因为它是我在这里注意到的唯一模式。我不知道有任何这些案例被记录下来。



您还知道 CHECKSUM_AGG 被正式记录为不可靠,即它可能无法准确反映数据更改,对吧?该文档没有对此进行详细说明,只是说有时它甚至不会告诉您发生了变化。 (实际的措辞是但是,校验和不会改变的可能性很小。)



无论如何 Michael Swart似乎发现 CHECKSUM_AGG 只是 XOR (这是他的证明):

 创建表格#f(a浮动); 
GO

插入#f值(RAND());
GO 20

声明@i INT = 0;
选择@i = @i ^ CHECKSUM(a)FROM #f;
从#f中选择@i,CHECKSUM_AGG(CHECKSUM(a));
GO

删除表#f;
GO

他继续解释,这可能会导致误导范围的结果值的很小,因为生成可靠校验和的几率会迅速降低。


It seems that there are a number of conditions that cause CHECKSUM_AGG to return 0 that I wouldn't have expected. I've only been able to find one discussed, which is that duplicate values will cause it. That can be solved via a DISTINCT or GROUP BY.

I've also found a couple more scenarios that make less sense to me. One was provided by my boss and one I found MSDN. These I don't know how to explain. Here is some SQL that demonstrates the scenarios:

SELECT
    CHECKSUM_AGG(T.Number) AS ChecksumAgggregate
FROM
(
    VALUES
        (2)
        , (3)
        , (4)
        , (5)
)AS T(Number)

DECLARE @t TABLE 
(
    Category VARCHAR(15),
    Value VARCHAR(10)
)

INSERT @t 
(
    Category
    , Value
)
VALUES 
    ('OneCharacter','a')
    ,('OneCharacter','b')
    ,('OneCharacter','c')
    ,('OneCharacter','d')

    ,('TwoCharacters','aa')
    ,('TwoCharacters','bb')
    ,('TwoCharacters','cc')
    ,('TwoCharacters','dd')

    ,('ThreeCharacters','aaa')
    ,('ThreeCharacters','bbb')
    ,('ThreeCharacters','ccc')
    ,('ThreeCharacters','ddd')

    ,('SixCharacters','aaaaaa')
    ,('SixCharacters','bbbbbb')
    ,('SixCharacters','cccccc')
    ,('SixCharacters','dddddd')

    ,('AllValues','a')
    ,('AllValues','b')
    ,('AllValues','c')
    ,('AllValues','d')
    ,('AllValues','aa')
    ,('AllValues','bb')
    ,('AllValues','cc')
    ,('AllValues','dd')
    ,('AllValues','aaa')
    ,('AllValues','bbb')
    ,('AllValues','ccc')
    ,('AllValues','ddd')
    ,('AllValues','aaaaaa')
    ,('AllValues','bbbbbb')
    ,('AllValues','cccccc')
    ,('AllValues','dddddd')

select 
    Category, CHECKSUM_AGG(CHECKSUM(Value)) 
from @t 
group by Category

select Category, Value, CHECKSUM(Value) ValueChecksum
from @t

It's nothing but 0's in these examples from CHECKSUM_AGG from these queries. The last query shows that none of the CHECKSUM values that are getting input into the CHECKSUM_AGG call are duplicated.

I'm hoping that whatever answer describes what causes CHECKSUM_AGG to return 0 will explain these situations as well.

解决方案

This can happen if, for example, there is an even number of rows, or if the values sum up to certain values (e.g. 14). These both yield 0:

SELECT CHECKSUM_AGG(N)
FROM (VALUES (1),(2),(3),(4),(4)) AS T(N);

SELECT CHECKSUM_AGG(N)
FROM (VALUES (2),(3),(4),(5)) AS T(N);

But these do not:

SELECT CHECKSUM_AGG(N)
FROM (VALUES (1),(2),(3),(4),(5)) AS T(N);

SELECT CHECKSUM_AGG(N)
FROM (VALUES (2),(3),(4),(6)) AS T(N);

14 may just be a coincidence (and it doesn't happen for that value across 6 rows), I mention it only because it's the only pattern I noticed here. I don't know that any of these cases are documented.

Also you are aware that CHECKSUM_AGG is officially documented as being not reliable, i.e. it may not accurately reflect a data change, right? The doc doesn't go into details on this, it just states that sometimes it won't tell you there's been a change even if there has. (The actual wording is "However, there is a small chance that the checksum will not change.")

Anyway Michael Swart seems to have discovered that CHECKSUM_AGG is just XOR (here is his proof):

CREATE TABLE #f(a FLOAT);
GO

INSERT #f VALUES (RAND());
GO 20

DECLARE @i INT = 0;
SELECT @i = @i ^ CHECKSUM(a) FROM #f;
SELECT @i, CHECKSUM_AGG(CHECKSUM(a)) FROM #f;
GO

DROP TABLE #f;
GO

As he goes on to explain, this can lead to misleading results when the range of values is very small, since the odds of generating a reliable checksum diminishes rapidly.

这篇关于什么情况导致CHECKSUM_AGG返回0?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆