Generate ID from string in Python
Question
I'm struggling a bit to generate an ID of type integer for a given string in Python.
I thought the built-in hash function would be perfect, but the IDs it produces are sometimes too long. That's a problem because I'm limited to 64 bits as the maximum length.
My code so far: hash(s) % 10000000000
The input strings I can expect will be 12-512 characters long.
Requirements are:
- integers only
- generated from the provided string
- ideally up to 10-12 characters long (I'll have only ~5 million items)
- low probability of collision..?
I would be glad if someone can provide any tips / solutions.
Solution
I would do something like this:
>>> import hashlib
>>> m = hashlib.md5()
>>> m.update("some string".encode("utf-8"))  # hashlib needs bytes in Python 3
>>> str(int(m.hexdigest(), 16))[0:12]
'120665287271'
The idea:
- Calculate the hash of the string with MD5 (or SHA-1 or ...) in hexadecimal form (see the hashlib module)
- Convert the hex string into an integer and then back into a base-10 string (so the result contains only digits)
- Use the first 12 characters of the string.
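The steps above can be wrapped in a small Python 3 function. This is only a sketch: the name string_to_id and the digits parameter are made up for illustration, and UTF-8 is an assumed encoding. Unlike the built-in hash(), which for strings is salted per process in Python 3 unless PYTHONHASHSEED is set, an MD5-based ID stays the same between runs.

```python
import hashlib


def string_to_id(text, digits=12):
    """Return a string of decimal digits derived from the MD5 hash of text.

    Hypothetical helper: the name and the `digits` parameter are illustrative.
    """
    # hashlib operates on bytes, so encode first (UTF-8 is an assumption).
    hex_digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    # Read the hex digest as an integer, write it in base 10,
    # and keep only the leading digits.
    return str(int(hex_digest, 16))[:digits]


print(string_to_id("some string"))
```

Call int() on the result if an actual integer is needed rather than a digit string.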
If characters a-f are also okay, I would do m.hexdigest()[0:12].
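On the "low probability of collision" requirement, it may help to estimate the odds with the birthday approximation p ≈ 1 - exp(-n(n-1) / (2N)), where n is the number of items and N the number of possible IDs. The sketch below is an assumption-laden illustration, not part of the original answer: the helper names are hypothetical, SHA-256 is swapped in for MD5, and the ID is kept to 63 bits on the assumption that it must fit a signed 64-bit field.

```python
import hashlib
import math


def string_to_id64(text):
    """Hypothetical helper: derive a non-negative integer below 2**63 from text."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    # Take the first 8 bytes and drop one bit so the value fits a signed 64-bit field.
    return int.from_bytes(digest[:8], "big") >> 1


def collision_probability(n_items, id_space):
    """Birthday approximation of the chance that at least two IDs collide."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-n_items * (n_items - 1) / (2 * id_space))


n = 5_000_000
print(collision_probability(n, 10**12))  # 12 decimal digits
print(collision_probability(n, 2**63))   # 63-bit IDs
print(string_to_id64("some string"))
```

With roughly 5 million items, a 12-digit ID space gives n²/(2N) ≈ 12.5 expected colliding pairs, so at least one collision is close to certain. Using the full 64-bit budget brings the estimate down to about one in a million, so truncating to 10-12 digits is worth it only if collisions are detected and handled separately.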