SQL CLR function based on .NET ComputeHash is not working with Cyrillic


Problem description


I have written the following SQL CLR function in order to hash string values larger than 8000 bytes (the input limit of the T-SQL built-in HASHBYTES function):

[SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true)]
public static SqlBinary HashBytes(SqlString algorithm, SqlString value)
{
    HashAlgorithm algorithmType = HashAlgorithm.Create(algorithm.Value);

    if (algorithmType == null || value.IsNull)
    {
        return new SqlBinary();
    }
    else
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value.Value);
        return new SqlBinary(algorithmType.ComputeHash(bytes));
    }
}

It is working fine for Latin strings. For example, the following hashes are the same:

SELECT dbo.fn_Utils_GetHashBytes ('MD5', 'test'); -- 0x098F6BCD4621D373CADE4E832627B4F6
SELECT HASHBYTES ('MD5', 'test');                 -- 0x098F6BCD4621D373CADE4E832627B4F6

The issue is it is not working with Cyrillic strings. For example:

SELECT dbo.fn_Utils_GetHashBytes ('MD5 ', N'даровете на влъхвите') -- NULL
SELECT HashBytes ('MD5 ',N'даровете на влъхвите') -- 0x838B1B625A6074B2BE55CDB7FCEA2832

SELECT dbo.fn_Utils_GetHashBytes ('SHA256', N'даровете на влъхвите') -- 0xA1D65374A0B954F8291E00BC3DD9DF655D8A4A6BF127CFB15BBE794D2A098844
SELECT HashBytes ('SHA2_256',N'даровете на влъхвите') -- 0x375F6993E0ECE1864336E565C8E14848F2A4BAFCF60BC0C8F5636101DD15B25A 

I am getting NULL for MD5, although the same code returns a value when it is executed as a console application. Could anyone tell me what I am doing wrong?


Also, I got the function from here, and one of the comments says:

Careful with CLR SP parameters being silently truncated to 8000 bytes - I had to tag the parameter with [SqlFacet(MaxSize = -1)] otherwise bytes after the 8000th would simply be ignored!

but I have tested this and it is working fine. For example, if I hash an 8000-byte string and then hash the same string plus one extra character, the two hashes are different.

DECLARE @A VARCHAR(MAX) = '8000 bytes string...'
DECLARE @B VARCHAR(MAX) = @A + '1'
SELECT LEN(@A), LEN(@B)

SELECT IIF(dbo.fn_Utils_GetHashBytes ('MD5', @A + '1') = dbo.fn_Utils_GetHashBytes ('MD5', @B), 1, 0) -- 0

Should I worry about this?

Solution

 Encoding.UTF8.GetBytes(...)

SQL Server has no concept of UTF-8. Use UCS-2 (UTF-16) or ASCII. The encoding used must match what you'd pass to HASHBYTES. You can easily see that HASHBYTES hashes VARCHAR and NVARCHAR input differently:

select HASHBYTES('MD5', 'Foo')  -- 0x1356C67D7AD1638D816BFB822DD2C25D
select HASHBYTES('MD5', N'Foo') -- 0xB25FF0AD90D09D395090E8A29FF4C63C
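
The same effect can be reproduced outside SQL Server. The following is a minimal console sketch, not part of the original answer: the class name `EncodingHashDemo` and the helper `Md5Hex` are hypothetical names introduced for illustration. It hashes the same string once as UTF-8 bytes and once as UTF-16LE bytes (`Encoding.Unicode`, the byte layout NVARCHAR uses), which is why the UTF-8 based function can only agree with HASHBYTES for plain ASCII VARCHAR input.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class EncodingHashDemo
{
    // Hypothetical helper: MD5 of a byte array as an uppercase hex string.
    public static string Md5Hex(byte[] data)
    {
        using (MD5 md5 = MD5.Create())
        {
            return BitConverter.ToString(md5.ComputeHash(data)).Replace("-", "");
        }
    }

    public static void Main()
    {
        string latin = "test";
        string cyrillic = "даровете на влъхвите";

        // UTF-8 uses one byte per ASCII character, so for "test" this is the
        // same byte sequence HASHBYTES sees for the VARCHAR literal 'test'.
        Console.WriteLine("UTF-8    test:     " + Md5Hex(Encoding.UTF8.GetBytes(latin)));

        // UTF-16LE uses two bytes per character here, so the hash differs.
        Console.WriteLine("UTF-16LE test:     " + Md5Hex(Encoding.Unicode.GetBytes(latin)));

        // For Cyrillic text the UTF-8 and UTF-16LE byte sequences also differ,
        // so only the UTF-16LE hash can line up with HASHBYTES on an N'...' literal.
        Console.WriteLine("UTF-8    cyrillic: " + Md5Hex(Encoding.UTF8.GetBytes(cyrillic)));
        Console.WriteLine("UTF-16LE cyrillic: " + Md5Hex(Encoding.Unicode.GetBytes(cyrillic)));
    }
}
```

If the function has to keep its string parameter, swapping `Encoding.UTF8` for `Encoding.Unicode` would be the corresponding change for NVARCHAR input, under the assumption that callers always pass NVARCHAR.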

Best would be to change the SQLCLR function to accept the bytes, not a string, and deal with the cast to VARBINARY in the caller.

 SELECT dbo.fn_Utils_GetHashBytes ('MD5', CAST(N'даровете на влъхвите' AS VARBINARY(MAX)));
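
A byte-based version of the function might look like the sketch below. This is an illustration under stated assumptions rather than the answer author's code: the class name `HashFunctions` is hypothetical, the parameter type `SqlBytes` is chosen here to carry VARBINARY(MAX) input, and the `[SqlFacet(MaxSize = -1)]` tag follows the comment quoted in the question. It also guards against a NULL algorithm name, which the original function did not.

```csharp
using System.Data.SqlTypes;
using System.Security.Cryptography;
using Microsoft.SqlServer.Server;

public static class HashFunctions
{
    [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true)]
    public static SqlBinary HashBytes(
        SqlString algorithm,
        // Assumption: MaxSize = -1 marks the parameter as VARBINARY(MAX)
        // when the function is deployed from the assembly metadata.
        [SqlFacet(MaxSize = -1)] SqlBytes value)
    {
        // Return SQL NULL for NULL inputs instead of throwing.
        if (algorithm.IsNull || value == null || value.IsNull)
        {
            return SqlBinary.Null;
        }

        // HashAlgorithm.Create returns null for an unrecognized name;
        // a using statement tolerates a null resource.
        using (HashAlgorithm hasher = HashAlgorithm.Create(algorithm.Value))
        {
            if (hasher == null)
            {
                return SqlBinary.Null;
            }

            // Hash the raw bytes exactly as the caller supplied them;
            // no text encoding is involved inside the function.
            return new SqlBinary(hasher.ComputeHash(value.Stream));
        }
    }
}
```

With this shape the caller decides the byte representation through the CAST, so a VARCHAR and an NVARCHAR argument are hashed as the distinct byte sequences they are, the same way HASHBYTES treats them.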

FYI, SQL Server 2016 has lifted the 8000-byte restriction on HASHBYTES:

For SQL Server 2014 and earlier, allowed input values are limited to 8000 bytes.
