Max limit of bytes in method update of Hashlib Python module

Views: 138  Published: 2020/6/17 19:32:37  Tags: python hashlib

Question

I am trying to compute the MD5 hash of a file with the function hashlib.md5() from the hashlib module.

So I wrote this code:

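import hashlib

Buffer = 128  # number of bytes read per call
m = hashlib.md5()

# Read the file in fixed-size chunks and feed each chunk to the hash
with open("c:\\file.tct", "rb") as f:
    while True:
        p = f.read(Buffer)
        if len(p) != 0:
            m.update(p)
        else:
            break
print(m.hexdigest())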

I noticed that update is faster when I increase the Buffer value to 64, 128, 256 and so on. Is there an upper limit I cannot exceed? I suppose it might only be a RAM limitation, but I don't know.

Recommended answer

Big (≈2**40) chunk sizes lead to MemoryError, i.e., there is no limit other than available RAM. On the other hand, bufsize is limited to 2**31-1 on my machine:

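import hashlib
from functools import partial

def md5(filename, chunksize=2**15, bufsize=-1):
    # chunksize: bytes passed to each update() call
    # bufsize: buffering argument of open(); -1 selects the default
    m = hashlib.md5()
    with open(filename, 'rb', bufsize) as f:
        # iter() with a sentinel calls f.read(chunksize) until it returns b''
        for chunk in iter(partial(f.read, chunksize), b''):
            m.update(chunk)
    return m  # hash object; call .hexdigest() for the hex string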
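
As a quick sanity check on the chunked approach, the sketch below confirms that the digest does not depend on the chunk size: feeding the data to update() in pieces should give the same result as hashing it in one call. The helper name md5_file, the temporary file, and the list of sizes are assumptions made for illustration only.

```python
import hashlib
import os
import tempfile
from functools import partial


def md5_file(filename, chunksize=2**15):
    """Hash a file in fixed-size chunks (md5_file is a hypothetical helper name)."""
    m = hashlib.md5()
    with open(filename, "rb") as f:
        for chunk in iter(partial(f.read, chunksize), b""):
            m.update(chunk)
    return m.hexdigest()


# Write throwaway random data so the example does not need an existing file.
data = os.urandom(100_000)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
    path = tmp.name

try:
    # Reference digest computed from the whole byte string at once.
    one_shot = hashlib.md5(data).hexdigest()
    # Digests computed with several chunk sizes, small and large.
    digests = {size: md5_file(path, size) for size in (64, 128, 4096, 2**15, 2**20)}
finally:
    os.remove(path)

# True if every chunk size produced the same digest as the one-shot hash.
print(all(d == one_shot for d in digests.values()))
```

The chunk size therefore only affects speed and memory use, not the resulting hash.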

A big chunksize can be as slow as a very small one. Measure it.

I find that for ≈10 MB files, a chunksize of 2**15 is the fastest among the files I've tested.
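
One way to do the measuring is sketched below. It is only a rough benchmark under stated assumptions: a 10 MB temporary file of random bytes stands in for the real input, time_md5 is a hypothetical helper, and the set of chunk sizes is arbitrary. Timings vary with the machine, the disk, and the OS file cache, so no particular result should be expected.

```python
import hashlib
import os
import tempfile
import time
from functools import partial


def time_md5(filename, chunksize):
    """Return (seconds, hexdigest) for one chunked pass over the file."""
    m = hashlib.md5()
    start = time.perf_counter()
    with open(filename, "rb") as f:
        for chunk in iter(partial(f.read, chunksize), b""):
            m.update(chunk)
    return time.perf_counter() - start, m.hexdigest()


# Assumption: 10 MB of random bytes is a stand-in for the file being hashed.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(10 * 1024 * 1024))
    path = tmp.name

try:
    results = {}
    for exp in (7, 10, 15, 20, 23):
        # Keep the fastest of three runs to reduce noise from caching and scheduling.
        runs = [time_md5(path, 2**exp) for _ in range(3)]
        results[exp] = min(runs)
finally:
    os.remove(path)

for exp, (seconds, digest) in results.items():
    print(f"chunksize=2**{exp:<2} {seconds * 1000:8.2f} ms  {digest}")
```

Running it against a file of realistic size, on the storage it will actually be read from, gives a better basis for choosing chunksize than any fixed rule.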
