How to write a large amount of data in a tarfile in python without using temporary file
Question
I've written a small cryptographic module in Python whose task is to encrypt a file and put the result in a tarfile. The original file to encrypt can be quite large, but that's not a problem, because my program only needs to work with a small block of data at a time, which can be encrypted on the fly and stored.
I'm looking for a way to avoid doing it in two passes: first writing all the data to a temporary file, then inserting the result into the tarfile.
Basically I do the following (where generator_encryptor is a simple generator that yields chunks of data read from the source file):
import tarfile

t = tarfile.open("target.tar", "w")
tmp = open('content', 'wb')
for chunk in generator_encryptor("sourcefile"):
    tmp.write(chunk)
tmp.close()
t.add('content')
t.close()
I'm a bit annoyed at having to use a temporary file, as I feel it should be easy to write blocks directly into the tar file. But collecting every chunk in a single string and using something like t.addfile('content', StringIO(bigcipheredstring)) seems ruled out, because I can't guarantee I have enough memory to hold bigcipheredstring.
Any hint on how to do that?
Answer
You can create your own file-like object and pass it to TarFile.addfile. Your file-like object will generate the encrypted contents on the fly in its read() method.
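A minimal sketch of that idea follows. Note that tarfile needs the member's size up front in the TarInfo header, so this assumes the total encrypted size can be computed in advance; fake_encryptor is a hypothetical stand-in for the question's generator_encryptor:

```python
import io
import tarfile

class GeneratorFile(io.RawIOBase):
    """Read-only file-like wrapper around a generator of byte chunks."""
    def __init__(self, gen):
        self._gen = gen
        self._buf = b""

    def readable(self):
        return True

    def read(self, size=-1):
        # Pull chunks from the generator until the request can be satisfied
        # (or the generator is exhausted).
        while size < 0 or len(self._buf) < size:
            try:
                self._buf += next(self._gen)
            except StopIteration:
                break
        if size < 0:
            data, self._buf = self._buf, b""
        else:
            data, self._buf = self._buf[:size], self._buf[size:]
        return data

def fake_encryptor(path):
    # Hypothetical cipher: yields four 1 KiB "encrypted" chunks.
    for i in range(4):
        yield bytes([i]) * 1024

# tarfile requires the size in the header, so it must be known in advance.
total_size = 4 * 1024

info = tarfile.TarInfo(name="content")
info.size = total_size
with tarfile.open("target.tar", "w") as t:
    t.addfile(info, GeneratorFile(fake_encryptor("sourcefile")))
```

With this approach only one chunk (plus a small buffer) is in memory at a time, so the temporary file is no longer needed; the one constraint is knowing the final size before writing the header.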