持久化 sha256 哈希对象? [英] Persisting sha256 hash objects?

查看:26
本文介绍了持久化 sha256 哈希对象?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要一个 Python/C/C++/Java 实现,它可以暂停散列进度存储在文件中的进度,使得进度可在稍后阶段从该文件中恢复.

I need a Python/C/C++/Java implementation which can pause hashing progress and store that progress in a file in such a way that the progress is recoverable from that file at a later stage.

无论它是用上面列出的哪种语言编写的,它都应该在 Python 中正常工作.建议您可以提供它以与hashlib"配合使用,但这不是必需的.另外,如果这样的东西已经存在,一个链接就足够了.

No matter in what language it is written from above listed, it should work properly in Python. It is recommended that you may provide that to work well with "hashlib" however it is not necessary. Also, if such a thing already exist, a link to that is sufficient.

对于一个想法,您的实施应该实现什么.

For an idea, what your implementation should achieve.

import hashlib
import hashpersist #THIS IS NEEDED.

sha256 = hashlib.sha256("Hello ")
hashpersist.save_state(sha256, open('test_file', 'w'))

sha256_recovered = hashpersist.load_state(open('test_file', 'r'))
sha256_recovered.update("World")
print sha256_recovered.hexdigest()

这应该会产生与我们使用标准 sha256 函数对Hello World"进行简单散列的输出相同的输出.

This should give the same output as we had done simple hashing of "Hello World" with standard sha256 function.

a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e

推荐答案

结果证明将 hashlib 重写为可恢复的(至少是 SHA-256 部分)比我想象的要容易.我花了一些时间玩使用 OpenSSL 加密库的 C 代码,但后来我意识到我不需要所有这些东西,我可以只使用 ctypes.

It turned out to be easier than I thought to rewrite hashlib to be resumable, at least, the SHA-256 portion. I spent some time playing with C code that uses the OpenSSL crypto library, but then I realised that I don't need all that stuff, I can just use ctypes.

rehash.py

#! /usr/bin/env python

''' A resumable implementation of SHA-256 using ctypes with the OpenSSL crypto library

    Written by PM 2Ring 2014.11.13
'''

from ctypes import *

SHA_LBLOCK = 16
SHA256_DIGEST_LENGTH = 32

class SHA256_CTX(Structure):
    _fields_ = [
        ("h", c_long * 8),
        ("Nl", c_long),
        ("Nh", c_long),
        ("data", c_long * SHA_LBLOCK),
        ("num", c_uint),
        ("md_len", c_uint)
    ]

HashBuffType = c_ubyte * SHA256_DIGEST_LENGTH

#crypto = cdll.LoadLibrary("libcrypto.so")
crypto = cdll.LoadLibrary("libeay32.dll" if os.name == "nt" else "libssl.so")

class sha256(object):
    digest_size = SHA256_DIGEST_LENGTH

    def __init__(self, datastr=None):
        self.ctx = SHA256_CTX()
        crypto.SHA256_Init(byref(self.ctx))
        if datastr:
            self.update(datastr)

    def update(self, datastr):
        crypto.SHA256_Update(byref(self.ctx), datastr, c_int(len(datastr)))

    #Clone the current context
    def _copy_ctx(self):
        ctx = SHA256_CTX()
        pointer(ctx)[0] = self.ctx
        return ctx

    def copy(self):
        other = sha256()
        other.ctx = self._copy_ctx()
        return other

    def digest(self):
        #Preserve context in case we get called before hashing is
        # really finished, since SHA256_Final() clears the SHA256_CTX
        ctx = self._copy_ctx()
        hashbuff = HashBuffType()
        crypto.SHA256_Final(hashbuff, byref(self.ctx))
        self.ctx = ctx
        return str(bytearray(hashbuff))

    def hexdigest(self):
        return self.digest().encode('hex')

#Tests
def main():
    import cPickle
    import hashlib

    data = ("Nobody expects ", "the spammish ", "imposition!")

    print "rehash
"

    shaA = sha256(''.join(data))
    print shaA.hexdigest()
    print repr(shaA.digest())
    print "digest size =", shaA.digest_size
    print

    shaB = sha256()
    shaB.update(data[0])
    print shaB.hexdigest()

    #Test pickling
    sha_pickle = cPickle.dumps(shaB, -1)
    print "Pickle length:", len(sha_pickle)
    shaC = cPickle.loads(sha_pickle)

    shaC.update(data[1])
    print shaC.hexdigest()

    #Test copying. Note that copy can be pickled
    shaD = shaC.copy()

    shaC.update(data[2])
    print shaC.hexdigest()


    #Verify against hashlib.sha256()
    print "
hashlib
"

    shaD = hashlib.sha256(''.join(data))
    print shaD.hexdigest()
    print repr(shaD.digest())
    print "digest size =", shaD.digest_size
    print

    shaE = hashlib.sha256(data[0])
    print shaE.hexdigest()

    shaE.update(data[1])
    print shaE.hexdigest()

    #Test copying. Note that hashlib copy can NOT be pickled
    shaF = shaE.copy()
    shaF.update(data[2])
    print shaF.hexdigest()


if __name__ == '__main__':
    main()

resumable_SHA-256.py

#! /usr/bin/env python

''' Resumable SHA-256 hash for large files using the OpenSSL crypto library

    The hashing process may be interrupted by Control-C (SIGINT) or SIGTERM.
    When a signal is received, hashing continues until the end of the
    current chunk, then the current file position, total file size, and
    the sha object is saved to a file. The name of this file is formed by
    appending '.hash' to the name of the file being hashed.

    Just re-run the program to resume hashing. The '.hash' file will be deleted
    once hashing is completed.

    Written by PM 2Ring 2014.11.14
'''

import cPickle as pickle
import os
import signal
import sys

import rehash

quit = False

blocksize = 1<<16   # 64kB
blocksperchunk = 1<<8

chunksize = blocksize * blocksperchunk

def handler(signum, frame):
    global quit
    print "
Got signal %d, cleaning up." % signum
    quit = True


def do_hash(fname, filesize):
    hashname = fname + '.hash'
    if os.path.exists(hashname):
        with open(hashname, 'rb') as f:
            pos, fsize, sha = pickle.load(f)
        if fsize != filesize:
            print "Error: file size of '%s' doesn't match size recorded in '%s'" % (fname, hashname)
            print "%d != %d. Aborting" % (fsize, filesize)
            exit(1)
    else:
        pos, fsize, sha = 0, filesize, rehash.sha256()

    finished = False
    with open(fname, 'rb') as f:
        f.seek(pos)
        while not (quit or finished):
            for _ in xrange(blocksperchunk):
                block = f.read(blocksize)
                if block == '':
                    finished = True
                    break
                sha.update(block)

            pos += chunksize
            sys.stderr.write(" %6.2f%% of %d
" % (100.0 * pos / fsize, fsize))
            if finished or quit:
                break

    if quit:
        with open(hashname, 'wb') as f:
            pickle.dump((pos, fsize, sha), f, -1)
    elif os.path.exists(hashname):
        os.remove(hashname)

    return (not quit), pos, sha.hexdigest()


def main():
    if len(sys.argv) != 2:
        print "Resumable SHA-256 hash of a file."
        print "Usage:
python %s filename
" % sys.argv[0]
        exit(1)

    fname = sys.argv[1]
    filesize = os.path.getsize(fname)

    signal.signal(signal.SIGINT, handler)
    signal.signal(signal.SIGTERM, handler)

    finished, pos, hexdigest = do_hash(fname, filesize)
    if finished:
        print "%s  %s" % (hexdigest, fname)
    else:
        print "sha-256 hash of '%s' incomplete" % fname
        print "%s" % hexdigest
        print "%d / %d bytes processed." % (pos, filesize)


if __name__ == '__main__':
    main()

演示

import rehash
import pickle
sha=rehash.sha256("Hello ")
s=pickle.dumps(sha.ctx)
sha=rehash.sha256()
sha.ctx=pickle.loads(s)
sha.update("World")
print sha.hexdigest()

输出

a591a6d40bf420404a011733cfb7b190d62c65bf0bcda32b57b277d9ad9f146e

编辑

我刚刚做了一个小修改,让 rehash 也能在 Windows 上运行,尽管我只在 WinXP 上测试过.libeay32.dll 可以在当前目录中,也可以在系统库搜索路径中的某个位置,例如WINDOWSsystem32.尽管 OpenOffice 和 Avira 使用了 .dll,但我相当古老(并且大部分未使用)的 XP 安装找不到 .dll.所以我只是将它从 Avira 文件夹复制到 system32.现在它可以完美运行.:)

I've just made a minor edit to allow rehash to work on Windows, too, although I've only tested it on WinXP. The libeay32.dll can be in the current directory, or somewhere in the system library search path, eg WINDOWSsystem32. My rather ancient (and mostly unused) XP installation couldn't find the .dll, even though it's used by OpenOffice and Avira. So I just copied it from the Avira folder to system32. And now it works perfectly. :)

这篇关于持久化 sha256 哈希对象?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆