TSQL 校验和难题 [英] TSQL CHECKSUM conundrum

查看:24
本文介绍了TSQL 校验和难题的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

SELECT BINARY_CHECKSUM('Clifton House, Thornaby Place, Teesdale South, Stockton-On-Tees, Cleveland, TS17 6SD')
SELECT BINARY_CHECKSUM('Clifton House, Teesdale South, Thornaby Place, Stockton-On-Tees, Cleveland, TS17 6SD')

SELECT BINARY_CHECKSUM('Glenfield Hospital, Groby Road, , Leicester, Leicestershire, LE3 9DZ')
SELECT BINARY_CHECKSUM('Glenfield Hospital, Groby Road, , Leicester, Leicestershire, LE3 9EJ')

看看上面的内容.即使文本中存在差异,这 2 对地址也会生成相同的校验和值.我的理解是,虽然您不能保证 CHECKSUM 和 BINARY_CHECKSUM 对于任何随机内容都不同,但它们应该有助于确定给定行中相对较小的变化.

Have a look at the above. The 2 pairs of addresses will generate the same checksum value even though there are differences in the text. It is my understanding that, whilst you cannot guarantee that CHECKSUM and BINARY_CHECKSUM will be different for any random content that they should be good for determining relatively small changes in a given row.

有趣的是,这对值正好相反.它们为非常相似的数据值生成相等的校验和值.实际上,这些是较大(680,000 条记录)地址表中唯一重复的校验和值.

Interestingly these pairs of values demonstrate precisely the opposite. They are generating equal checksum values for very similar data values. These are in fact the only duplicate checksum values in a largish (680,000 record) table of addresses.

我有点担心我在生成 UPDATE 时误解了校验和的值?我是否必须通过逐场比较的蛮力场才能绝对确定在一行数据中发现变化?

I am a little concerned that I have misunderstood the value of checksum in generating UPDATEs? Do I have to resort to a brute force field by field comparison to be absolutely certain of picking up a change in a row of data?

这些示例的原始数据位于 6 个单独的列中.为清晰起见,我已将代码示例缩减为最小状态.

The original data for these examples was in 6 separate columns. I have reduced the code sample to a minimal state for clarity.

推荐答案

校验和从来都不是万无一失的.对于完全始终正确的解决方案,暴力解决方案就是这样.不过也有一些不太强烈的方法可以奏效.

Checksums are never completely foolproof. For a totally-always-correct-no-matter-what-solution brute force is the way. There are some less intense methods that can work though.

校验和几乎是一个非常简单的哈希.尝试改用哈希字节.那么你可以使用更混乱的算法,比如 md5.

A checksum is pretty much a really simple hash. Try using hashbytes instead. then you can make use of a more chaotic algorithm like md5.

例如:

SELECT HASHBYTES('MD5', 'Glenfield Hospital, Groby Road, , Leicester, Leicestershire, LE3 9DZ')

这篇关于TSQL 校验和难题的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆