Python Gzip-即时添加文件 [英] Python Gzip - Appending to file on the fly

查看:117
本文介绍了Python Gzip-即时添加文件的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

是否可以使用Python动态添加到压缩后的文本文件中?

Is it possible to append to a gzipped text file on the fly using Python ?

基本上我正在这样做:-

Basically I am doing this:-

import gzip
content = "Lots of content here"
f = gzip.open('file.txt.gz', 'a', 9)
f.write(content)
f.close()

每6秒钟左右在文件上附加一行(请注意添加"),但是生成的文件与标准的未压缩文件一样大(完成后大约1MB).

A line is appended (note "appended") to the file every 6 seconds or so, but the resulting file is just as big as a standard uncompressed file (roughly 1MB when done).

明确指定压缩级别似乎也无济于事.

Explicitly specifying the compression level does not seem to make a difference either.

如果我之后将现有的未压缩文件gzip压缩,则其大小将减小到大约80kb.

If I gzip an existing uncompressed file afterwards, it's size comes down to roughly 80kb.

我猜测无法动态"附加到gzip文件并进行压缩吗?

Im guessing its not possible to "append" to a gzip file on the fly and have it compress ?

这是写入String.IO缓冲区,然后在完成后刷新到gzip文件的情况吗?

Is this a case of writing to a String.IO buffer and then flushing to a gzip file when done ?

推荐答案

这在创建和维护有效的gzip文件的意义上起作用,因为gzip格式允许串联的gzip流.

That works in the sense of creating and maintaining a valid gzip file, since the gzip format permits concatenated gzip streams.

但是,从获得糟糕的压缩效果来看,这是行不通的,因为给gzip压缩的每个实例提供的数据很少.压缩取决于利用先前数据的历史记录,但是在这里gzip基本上没有给出.

However it doesn't work in the sense that you get lousy compression, since you are giving each instance of gzip compression so little data to work with. Compression depends on taking advantage the history of previous data, but here gzip has been given essentially none.

您可以:a)在调用gzip将另一个gzip流添加到文件之前,累积至少几千个数据,包括许多行,或者b)做一些更复杂的事情,以附加到单个gzip流中,每次都留下有效的gzip流,并允许有效压缩数据.

You could either a) accumulate at least a few K of data, many of your lines, before invoking gzip to add another gzip stream to the file, or b) do something much more sophisticated that appends to a single gzip stream, leaving a valid gzip stream each time and permitting efficient compression of the data.

您可以在 gzlog.h中找到C中b)的示例. gzlog.c .我不认为Python具有直接在Python中实现gzlog所需的所有zlib接口,但是您可以从Python连接到C代码.

You find an example of b) in C, in gzlog.h and gzlog.c. I do not believe that Python has all of the interfaces to zlib needed to implement gzlog directly in Python, but you could interface to the C code from Python.

这篇关于Python Gzip-即时添加文件的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆