在python中哈希unicode字符串 [英] hash unicode string in python
问题描述
我尝试哈希一些unicode字符串:
I try to hash some unicode strings:
hashlib.sha1(s).hexdigest()
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-81:
ordinal not in range(128)
其中s
类似于:
œ∑¡™£¢∞§¶•ºº–≠œ∑´®†¥¨ˆøπ'åß∂ƒ©˙∆˚¬ ... æΩ≈ç√∫˜µ≤≥÷åйцукенгшщзхъфывапролджэячсмитьбюю..юбьтијџўќ† њѓs''"««««\dzћ÷…•∆љl«єђxcvіƒm≤≥ї!@#$© ∆∆∆∆∆∆∆∆∆∆•…÷ћzdzћ÷...•∆љlљ∆•...÷ћzћ÷...•∆љ∆•... љ∆•... љ∆•... ∆љ•... ∆љ•... љ∆ •…∆•…∆•... ∆•∆ ...•÷∆•...÷∆•...÷∆•...÷∆•...÷∆•...÷∆•...÷∆•...
œ∑¡™£¢∞§¶•ªº–≠œ∑´®†¥¨ˆøπ"‘åß∂ƒ©˙∆˚¬…æΩ≈ç√∫˜µ≤≥÷åйцукенгшщзхъфывапролджэячсмитьбююю..юбьтијџўќ†њѓѕ'‘"«««\dzћ÷…•∆љl«єђxcvіƒm≤≥ї!@#$©^&*(()––––––––––∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆∆•…÷ћzdzћ÷…•∆љlљ∆•…÷ћzћ÷…•∆љ∆•…љ∆•…љ∆•…∆љ•…∆љ•…љ∆•…∆•…∆•…∆•∆…•÷∆•…÷∆•…÷∆•…÷∆•…÷∆•…÷∆•…÷∆•…
我应该解决什么?
推荐答案
显然,hashlib.sha1
期望的不是unicode
对象,而是期望的str
对象中的字节序列.将您的unicode
字符串编码为一个字节序列(使用UTF-8编码)应该可以解决该问题:
Apparently hashlib.sha1
isn't expecting a unicode
object, but rather a sequence of bytes in a str
object. Encoding your unicode
string to a sequence of bytes (using, say, the UTF-8 encoding) should fix it:
>>> import hashlib
>>> s = u'é'
>>> hashlib.sha1(s.encode('utf-8'))
<sha1 HASH object @ 029576A0>
错误是因为它试图使用默认的ascii
编码将unicode
对象自动转换为str
,该编码无法处理所有那些非ASCII字符(因为您的字符串不是纯ASCII).
The error is because it is trying to convert the unicode
object to a str
automatically, using the default ascii
encoding, which can't handle all those non-ASCII characters (since your string isn't pure ASCII).
Python文档是了解更多有关Unicode和编码的一个很好的起点,这是 Articles by Joel Spolsky .
A good starting point for learning more about Unicode and encodings is the Python docs, and this article by Joel Spolsky.
这篇关于在python中哈希unicode字符串的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!