Calculate hash without using existing hash functions in Hive
Problem description
I want to calculate a hash for strings in Hive without writing any UDF, using only existing functions, so that I can use a similar approach to get a consistent hash in other languages. For example: are there any functions with which I could do something like adding characters or taking an XOR?
It depends on the version of Hive; see https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Misc.Functions
select XYZ, hash(XYZ) from ABC
has been available for years and applies plain old java.lang.String.hashCode(), returning an INT (32-bit hash)
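Because hash() delegates to java.lang.String.hashCode(), the same value can be reproduced outside the JVM. A minimal Python sketch (assuming strings made of BMP characters, since Java hashes UTF-16 code units):

```python
def java_string_hashcode(s: str) -> int:
    """Replicate java.lang.String.hashCode() for BMP-only strings,
    which should match Hive's hash() for string columns."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF  # emulate Java's 32-bit int overflow
    # Reinterpret the unsigned result as a signed 32-bit integer, as Java does
    return h - 0x1_0000_0000 if h >= 0x8000_0000 else h

print(java_string_hashcode("ABC"))  # → 64578
```

The result 64578 is what Java returns for "ABC".hashCode(); comparing it against SELECT hash('ABC') on your cluster is a quick way to confirm the two implementations agree.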
select XYZ, crc32(XYZ) from ABC
requires Hive 1.3 and applies the plain old Cyclic Redundancy Check (probably via java.util.zip.CRC32), returning a BIGINT (32-bit hash)
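CRC-32 is a standardized checksum, so matching it from another language is straightforward; a sketch using Python's zlib, assuming Hive hashes the UTF-8 bytes of the string:

```python
import zlib

def hive_crc32(s: str) -> int:
    # Hive's crc32() returns a BIGINT; zlib.crc32 already yields an
    # unsigned 32-bit value in Python 3, so no sign fix-up is needed.
    return zlib.crc32(s.encode("utf-8"))

print(hive_crc32("ABC"))
```

As a sanity check, any conforming CRC-32 implementation returns 0xCBF43926 for the standard test input "123456789".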
select XYZ, md5(XYZ), sha1(XYZ), sha2(XYZ,256), sha2(XYZ,512) from ABC
requires Hive 1.3 and applies strong cryptographic hash functions, returning a STRING with the hexadecimal representation of the binary value (128-, 160-, 256-, and 512-bit hashes)
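These cryptographic digests are fully standardized, so any language's implementation produces identical hex strings; for instance, with Python's hashlib (again assuming UTF-8 encoding of the input string):

```python
import hashlib

data = "abc".encode("utf-8")
print(hashlib.md5(data).hexdigest())     # → 900150983cd24fb0d6963f7d28e17f72
print(hashlib.sha1(data).hexdigest())    # → a9993e364706816aba3e25717850c26c9cd0d89d
print(hashlib.sha256(data).hexdigest())  # → ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
print(hashlib.sha512(data).hexdigest())
```

The values shown are the well-known test vectors for the input "abc", which is what makes these functions a good choice for a hash that must stay consistent across languages.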
[Edit] the answer to that post also has a very good workaround for applying crypto hash functions with older versions of Hive, using Apache Commons static methods and reflect().
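That workaround typically looks like the following sketch, calling the static hex-digest methods of org.apache.commons.codec.digest.DigestUtils via reflect() — this assumes Apache Commons Codec is on the Hive classpath, which is true in most distributions but should be verified for yours:

```sql
-- illustrative only: DigestUtils.md5Hex / sha256Hex are static methods
-- from Apache Commons Codec, invoked through Hive's reflect() UDF
select XYZ,
       reflect('org.apache.commons.codec.digest.DigestUtils', 'md5Hex',    XYZ),
       reflect('org.apache.commons.codec.digest.DigestUtils', 'sha256Hex', XYZ)
from ABC
```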