Is there a faster way (than this) to calculate the hash of a file (using hashlib) in Python?


Question

My current method is:

import hashlib

def get_hash(path=PATH, hash_type='md5'):
    # Look up the requested constructor by name (e.g. hashlib.md5) and create a hash object
    func = getattr(hashlib, hash_type)()
    with open(path, 'rb') as f:
        # Read in chunks of 1024 * block_size bytes until read() returns b'' (EOF)
        for block in iter(lambda: f.read(1024 * func.block_size), b''):
            func.update(block)
    return func.hexdigest()

It takes about 3.5 seconds to calculate the md5sum of an 842 MB ISO file on an i5 @ 1.7 GHz. I have tried different ways of reading the file, but all of them yield slower results. Is there, perhaps, a faster solution?
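For illustration, a minimal sketch of how such a timing can be taken (the file path below is a placeholder, not from the original question):

import time

start = time.time()
print(get_hash('/path/to/file.iso', 'md5'))      # placeholder path
print("{0:.3f} s".format(time.time() - start))   # wall-clock time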

EDIT: I replaced 2**16 (inside the f.read()) with 1024*func.block_size, since the default block_size for most hash functions supported by hashlib is 64 (except for 'sha384' and 'sha512', for which the default block_size is 128). Therefore, the chunk size is still the same (65536 bytes).
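These defaults are easy to verify (a small sketch; the values shown are the standard hashlib block sizes):

import hashlib

for name in ('md5', 'sha1', 'sha256', 'sha384', 'sha512'):
    print(name, getattr(hashlib, name)().block_size)
# md5, sha1 and sha256 report 64; sha384 and sha512 report 128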

EDIT(2): I did something wrong. It takes 8.4 seconds instead of 3.5. :(

EDIT(3): Apparently Windows was using the disk at +80% when I ran the function again. It really takes 3.5 seconds. Phew.

Another solution (roughly 0.5 seconds faster) is to use os.open():

import os
import hashlib

def get_hash(path=PATH, hash_type='md5'):
    func = getattr(hashlib, hash_type)()
    # os.O_BINARY only exists on Windows
    f = os.open(path, (os.O_RDWR | os.O_BINARY))
    # Read in chunks of 2048 * block_size bytes until os.read() returns b'' (EOF)
    for block in iter(lambda: os.read(f, 2048 * func.block_size), b''):
        func.update(block)
    os.close(f)
    return func.hexdigest()

Note that these results are not final.
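As an aside (not from the original post): os.O_BINARY is Windows-only, and opening read-only is enough for hashing, so a portable way to build the flags might look like this sketch:

import os

flags = os.O_RDONLY | getattr(os, 'O_BINARY', 0)   # O_BINARY falls back to 0 off Windows
fd = os.open('/path/to/file.iso', flags)            # placeholder path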

Answer

Using an 874 MiB random data file, which required 2 seconds with the openssl md5 tool, I was able to improve speed as follows.

  • Using the first method took 21 seconds.
  • Reading the entire file (21 seconds) into a buffer and then updating took 2 seconds.
  • Using the function below with a buffer size of 8096 took 17 seconds.
  • Using the function below with a buffer size of 32767 took 11 seconds.
  • Using the function below with a buffer size of 65536 took 8 seconds.
  • Using the function below with a buffer size of 131072 took 8 seconds.
  • Using the function below with a buffer size of 1048576 took 12 seconds.

import hashlib
import time

def md5_speedcheck(path, size):
    pts = time.process_time()
    ats = time.time()
    m = hashlib.md5()
    with open(path, 'rb') as f:
        # Read and hash the file in chunks of `size` bytes
        b = f.read(size)
        while len(b) > 0:
            m.update(b)
            b = f.read(size)
    print("{0:.3f} s".format(time.process_time() - pts))  # processor time
    print("{0:.3f} s".format(time.time() - ats))          # wall-clock time
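A small driver to reproduce the measurements above (a sketch; the path is a placeholder for your own test file):

for size in (8096, 32767, 65536, 131072, 1048576):
    print("buffer size:", size)
    md5_speedcheck('/path/to/file.iso', size)   # placeholder path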

The times noted above are human (wall-clock) time; processor time is about the same for all of these, with the difference spent blocking on I/O.

The key determinant here is to have a buffer size that is big enough to mitigate disk latency, but small enough to avoid VM page swaps. For my particular machine it appears that 64 KiB is about optimal.
