How to encrypt a large dataset using python-gnupg without sucking up all the memory?

Question

I have a very large text file on disk. Assume it is 1 GB or more. Also assume the data in this file has a \n character every 120 characters.

I am using python-gnupg to encrypt this file. Since the file is so large, I cannot read the entire file into memory at one time.

However, the gnupg.encrypt() method that I'm using requires that I send in all the data at once -- not in chunks. So how can I encrypt the file without using up all my system memory?

Here is some sample code:

import gnupg

gpg = gnupg.GPG(gnupghome='/Users/syed.saqibali/.gnupg/')

# Encrypt one line at a time and append each encrypted blob to the output file.
with open("VeryLargeFile.txt", "r") as infile, \
        open("EncryptedOutputFile.dat", "a") as outfile:
    for curr_line in infile:
        encrypted_ascii_data = gpg.encrypt(curr_line, "recipient@gmail.com")
        outfile.write(str(encrypted_ascii_data))

This sample produces an invalid output file because I cannot simply concatenate encrypted blobs together into a file.

Answer

I added a "b" (for binary) to the open command and it worked great for my code. For some reason, though, encrypting this way runs at less than half the speed of encrypting via a shell/bash command.
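Presumably this refers to passing a file handle opened in binary mode to gpg.encrypt_file() instead of reading the data and calling gpg.encrypt(). A minimal sketch of that approach, reusing the file names, recipient, and gnupghome path from the question:

import gnupg

gpg = gnupg.GPG(gnupghome='/Users/syed.saqibali/.gnupg/')

# Open the input in binary mode ("rb") and let gpg read it as a stream.
# output= tells python-gnupg to write the ciphertext directly to disk,
# so the whole file is never held in memory as one string.
with open("VeryLargeFile.txt", "rb") as infile:
    status = gpg.encrypt_file(infile,
                              recipients=["recipient@gmail.com"],
                              output="EncryptedOutputFile.dat")

print(status.ok, status.status)

Because the entire file goes through a single gpg invocation, the output is one valid OpenPGP message rather than a series of concatenated per-line blobs.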
