Unique number (HASH) for a given string
Problem description
Hi,
I want to generate a unique number for a given string. The strings are around 50-150 characters long. I was thinking of using GetHashCode, but I'm not sure it is the best bet. The challenge is that the generated number should be independent of any platform, meaning I want to use pure mathematical logic so that it works the same everywhere.
The input set won't be large; it will only be in the thousands.
My only requirements are that it generates the same number on all platforms and that the number is not repeated for different strings.
Any suggestions?
Recommended answer
Any hash function will eventually produce a collision, so you will get the same value for different inputs (that is the nature of a hash function).
Depending on your use case you can start with GetHashCode(), although it exists only on the .NET platform and its result may change between .NET versions.
You can look at the MurmurHash functions here:
blog.teamleadnet.com/2012/08/murmurhash3-ultra-fast-hash-algorithm.html
https://github.com/darrenkopp/murmurhash-net/
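To illustrate what a platform-independent hash built from "pure mathematical logic" can look like, here is a minimal sketch in Python. Note that it uses 64-bit FNV-1a rather than MurmurHash, simply because FNV-1a is short enough to show in full; the function name fnv1a_64 and the choice to hash the UTF-8 bytes of the string are assumptions made for this example. Like any hash, it can still collide.

```python
# 64-bit FNV-1a: only XOR and multiplication modulo 2**64, so the result
# does not depend on the runtime, CPU word size, or library version.
FNV64_OFFSET_BASIS = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3
MASK64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(text: str) -> int:
    """Return the 64-bit FNV-1a hash of text, computed over its UTF-8 bytes."""
    h = FNV64_OFFSET_BASIS
    # Assumption: every platform agrees to encode the string as UTF-8 first.
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * FNV64_PRIME) & MASK64  # keep the value within 64 bits
    return h


if __name__ == "__main__":
    print(hex(fnv1a_64("some 50-150 character string")))
```

The important design point is to fix the byte encoding explicitly. Two platforms that hash the same characters in different encodings will produce different numbers even if the algorithm is identical.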
Object.GetHashCode() won't work for that:
"Furthermore, the .NET Framework does not guarantee the default implementation of the GetHashCode method, and the value this method returns may differ between .NET Framework versions and platforms, such as 32-bit and 64-bit platforms."
Your second requirement, "any two non-identical inputs must not share the same hash", is difficult for any hash function.
A hash function, by definition, creates a fixed-length byte sequence from any given input, so there is a fixed maximum number of distinct outputs. You simply cannot guarantee that there is no collision in your set of a few thousand inputs.
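To put a rough number on that risk, the standard birthday-bound approximation can be evaluated for a given input count and hash width. The sketch below assumes an ideal, uniformly distributed hash; the function name collision_probability and the figure of 5000 inputs are illustrative assumptions, not values from the question.

```python
import math


def collision_probability(n_inputs: int, hash_bits: int) -> float:
    """Approximate P(at least one collision) among n_inputs distinct inputs.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2 * 2**bits)),
    which assumes the hash output is uniformly distributed.
    """
    pairs = n_inputs * (n_inputs - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


if __name__ == "__main__":
    for bits in (32, 64):
        print(bits, collision_probability(5000, bits))
```

For about 5000 inputs this gives a probability on the order of 0.3% for a 32-bit hash and well below one in a trillion for a 64-bit hash. That is small, but it is not zero, which is the point being made above.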
If you can already define all possible inputs, write a program that
1) creates all of them
2) for each of them creates a hash, random number, consecutive number, whatever
3) checks whether that output equals any already existing output
4) while (3) is true, repeats (2)
5) stores all of this in a look-up table (LUT)
At runtime, use the LUT instead of a function.
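The steps above could be sketched as follows. This is only one possible reading of the procedure: the helper names candidate_id and build_lookup_table, the use of a truncated SHA-256 digest as the candidate number, and the retry counter mixed into the hash are all assumptions made for the example.

```python
import hashlib


def candidate_id(text: str, attempt: int) -> int:
    """Derive a 32-bit candidate number from text and a retry counter."""
    data = f"{attempt}:{text}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")


def build_lookup_table(inputs):
    """Assign every distinct input string a number no other input has."""
    table = {}
    used = set()
    # Sorting makes the result independent of the order the inputs arrive in.
    for text in sorted(set(inputs)):
        attempt = 0
        number = candidate_id(text, attempt)
        # Steps 3 and 4: while the number is already taken, try again.
        while number in used:
            attempt += 1
            number = candidate_id(text, attempt)
        used.add(number)
        table[text] = number  # step 5: store the pair in the table
    return table


if __name__ == "__main__":
    lut = build_lookup_table(["first string", "second string", "third string"])
    print(lut["second string"])  # runtime lookup instead of hashing
```

Because a retry depends on which numbers are already taken, the numbers are only guaranteed stable for a fixed input set. The finished table should therefore be stored and shipped to every platform rather than rebuilt from a different set of inputs.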