Strange hashlib.md5 behavior in Python: a different hash each time


Question

I've run into some really strange behavior while trying to calculate the MD5 hash of a string. The returned hash is always wrong (and different each time) if I pass a string that is the result of concatenation. The only way I've found to get the real hash is to pass a string that wasn't modified in any way after creation.

Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:42:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> m = hashlib.md5() 
>>> a1 = "stack"
>>> a2 = "overflow"
>>> a3 = a1 + a2
>>> a4 = str(a1 + a2)
>>> m.update("stackoverflow")
>>> m.hexdigest()
'73868cb1848a216984dca1b6b0ee37bc'  # actual hash
>>> m.update(a1 + a2)
>>> m.hexdigest()
'458b7358b9e0c3f561957b96e543c5a8'
>>> m.update(a3)
>>> m.hexdigest()
'65b0e62d4ff2d91e111ecc8f27f0e8f5'
>>> m.update(a4)
>>> m.hexdigest()
'60c3ae3dd9a2095340b2e024194bad3c'
>>> m.update(a1 + a2)
>>> m.hexdigest()
'acd4e14145d34dcb10af785badf8e73e'
>>> m.update(a1 + a2)
>>> m.hexdigest()
'03c06ca09faa26166f1096db02272b11'
>>> a1 + a2 == a1 + a2
True
>>> a1 + a2 == a3
True
>>> a3 == a4
True

Am I missing something?

Recommended answer

What you are missing is that hash.update() doesn't replace the hashed data. You keep updating the same hash object, so you are getting the hash of the concatenated strings. From the hashlib.hash.update() documentation:

Update the hash object with the string arg. Repeated calls are equivalent to a single call with the concatenation of all the arguments: m.update(a); m.update(b) is equivalent to m.update(a+b).

So you are not getting the hash of a single 'stackoverflow' string each time. You get the hash of 'stackoverflow' first, then of 'stackoverflowstackoverflow', then of 'stackoverflowstackoverflowstackoverflow', and so on: every call appends another 'stackoverflow' to an ever longer input. None of those longer strings is equal to the original short string, so their hashes are not likely to be equal either.

Create a new hash object for each new string:

>>> import hashlib
>>> m = hashlib.md5()
>>> m.update('stack' + 'overflow')
>>> m.hexdigest()
'73868cb1848a216984dca1b6b0ee37bc'
>>> m = hashlib.md5()   # new hash object
>>> m.update('stackoverflow')
>>> m.hexdigest()
'73868cb1848a216984dca1b6b0ee37bc'
>>> m = hashlib.md5()     # new object again
>>> m.update('stack')     # add the string in pieces, part 1
>>> m.update('overflow')  # and part 2
>>> m.hexdigest()
'73868cb1848a216984dca1b6b0ee37bc'

You can readily reproduce your 'wrong' hashes by sending in concatenated data:

>>> m = hashlib.md5()
>>> m.update('stackoverflowstackoverflow')
>>> m.hexdigest()
'458b7358b9e0c3f561957b96e543c5a8'
>>> m = hashlib.md5()
>>> m.update('stackoverflowstackoverflowstackoverflow')
>>> m.hexdigest()
'65b0e62d4ff2d91e111ecc8f27f0e8f5'
>>> m = hashlib.md5()
>>> m.update('stackoverflow' * 4)
>>> m.hexdigest()
'60c3ae3dd9a2095340b2e024194bad3c'
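
The same equivalence can be checked in a loop rather than by eye. This is a minimal sketch, not part of the original answer: the names `piece`, `running`, `fresh` and `matches` are made up for illustration, and a bytes literal is used so that the snippet also runs on Python 3, where update() does not accept text strings.

```python
import hashlib

piece = b"stackoverflow"  # bytes literal, so this also runs on Python 3

running = hashlib.md5()   # one object reused across calls, as in the question
matches = []
for count in range(1, 5):
    running.update(piece)               # appends to the data hashed so far
    fresh = hashlib.md5(piece * count)  # new object fed the whole input at once
    matches.append(running.hexdigest() == fresh.hexdigest())

print(matches)  # [True, True, True, True]
```

After the loop, `running` holds the digest of four copies of the string, which is why it no longer matches the digest of a single 'stackoverflow'.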

Note that you can also pass the first string directly to the md5() function:

>>> hashlib.md5('stackoverflow').hexdigest()
'73868cb1848a216984dca1b6b0ee37bc'
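
If you hash many independent strings, it may help to wrap the one-shot form in a small function so that a hash object is never reused by accident. The sketch below is only an illustration: `md5_hex` is a hypothetical helper name, and encoding text as UTF-8 is an assumption (on Python 3, hashlib accepts only bytes-like input, so text has to be encoded one way or another).

```python
import hashlib

def md5_hex(data):
    """Return the MD5 hex digest of data, using a fresh hash object per call."""
    if not isinstance(data, bytes):
        # Assumption: text input is encoded as UTF-8 before hashing.
        data = data.encode("utf-8")
    return hashlib.md5(data).hexdigest()

a1 = "stack"
a2 = "overflow"
# Each call starts from an empty hash, so concatenation gives the same result.
print(md5_hex(a1 + a2) == md5_hex("stackoverflow"))  # True
```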

You normally use the hash.update() method only when you are processing data in chunks (for example, reading a file line by line or reading blocks of data from a socket) and don't want to hold all of that data in memory at once.
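
As an illustration of that chunked use, here is a rough sketch of hashing a file block by block. The function name `md5_of_file` and the 64 KiB default for `chunk_size` are arbitrary choices for this example, not something taken from the answer.

```python
import hashlib

def md5_of_file(path, chunk_size=65536):
    """Hash a file in fixed-size blocks so it never has to fit in memory."""
    digest = hashlib.md5()  # one object per file; update() accumulates on purpose
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Here the accumulating behavior of update() is exactly what is wanted: the result is the same as hashing the whole file contents in a single call.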
