What kind of hash algorithm is used for Hive's built-in HASH() function?

Question

What kind of hashing algorithm is used in the built-in HASH() function?

I'm ideally looking for a SHA512/SHA256 hash, similar to what the SHA() function offers within the linkedin datafu UDFs for Pig.

Recommended answer

The HASH function (as of Hive 0.11) uses an algorithm similar to java.util.List#hashCode.

Its code looks like this:

int hashCode = 0; // Hive HASH uses 0 as the seed, List#hashCode uses 1. I don't know why.
for (Object item: items) {
   hashCode = hashCode * 31 + (item == null ? 0 : item.hashCode());
}
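
To make the seed difference concrete, here is a small self-contained sketch that runs the loop above next to java.util.List#hashCode on the same values. The class and method names (HiveStyleHashSketch, hiveStyleHash) are made up for illustration, and item.hashCode() is only a stand-in: Hive hashes each argument through its own type-aware helpers, so real HASH() results may differ for some column types.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical names; this is a sketch of the loop above, not Hive's actual class.
class HiveStyleHashSketch {

    // Same loop as in the answer: seed 0, multiplier 31.
    public static int hiveStyleHash(List<?> items) {
        int hashCode = 0;
        for (Object item : items) {
            // Assumption: plain Java hashCode() stands in for Hive's per-type hashing.
            hashCode = hashCode * 31 + (item == null ? 0 : item.hashCode());
        }
        return hashCode;
    }

    public static void main(String[] args) {
        List<Object> row = Arrays.asList("a", 1, null);
        System.out.println(hiveStyleHash(row)); // seed 0: prints 93248
        System.out.println(row.hashCode());     // seed 1: prints 123039
    }
}
```

The two results differ by 31^3 = 29791, which is exactly what the seed of 1 contributes after three multiplications. Either way this is a 32-bit, non-cryptographic hash, so it is not a substitute for SHA256/SHA512.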

Basically it's a classic hash algorithm as recommended in the book Effective Java. To quote a great man (and a great book):

The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i. Modern VMs do this sort of optimization automatically.
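
The two claims in that quote are easy to check in plain Java. The sketch below (class and method names are made up for illustration) verifies the shift-and-subtract identity, including for values that overflow, and shows how an even multiplier drops the top bit while 31 does not.

```java
// Hypothetical names; a quick check of the quoted properties of 31.
class Multiplier31Sketch {

    // 31 * i rewritten as a shift and a subtraction.
    public static int times31(int i) {
        return (i << 5) - i;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, 7, -5, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : samples) {
            // Holds for every int, because both sides wrap around modulo 2^32.
            System.out.println(i + ": " + (31 * i == times31(i)));
        }

        int topBit = Integer.MIN_VALUE;  // only the highest bit is set
        System.out.println(topBit * 2);  // prints 0: the bit is shifted out and lost
        System.out.println(topBit * 31); // prints -2147483648: the bit survives
    }
}
```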

I digress. You can look at the HASH source here.

If you want to use SHAxxx in Hive, you can use the Apache Commons Codec DigestUtils class together with Hive's built-in reflect function (I hope that'll work):

SELECT reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex', 'your_string')
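
To sanity-check what that query returns, the same digest can be computed with only the JDK. This is a sketch under the assumption that DigestUtils.sha256Hex hashes the UTF-8 bytes of the string and returns lowercase hex; the class name Sha256HexSketch is made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class; a JDK-only stand-in for DigestUtils.sha256Hex.
class Sha256HexSketch {

    public static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two lowercase hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Should match the reflect(...) query above for the same input string.
        System.out.println(sha256Hex("your_string"));
    }
}
```

As far as I know, later Hive releases (1.3.0 onward) also document a built-in sha2(string, int) function, which would make the reflect workaround unnecessary there.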
