Python unzipping stream of bytes?
Here is the situation:
I get gzipped XML documents from Amazon S3:

    import boto
    from boto.s3.connection import S3Connection
    from boto.s3.key import Key

    conn = S3Connection('access Id', 'secret access key')
    b = conn.get_bucket('mydev.myorg')
    k = Key(b)
    k.key = 'documents/document.xml.gz'  # key is an attribute, not a method

I read them into a file like this:

    import gzip

    f = open('/tmp/p', 'wb')  # binary mode, since the payload is gzip data
    k.get_file(f)
    f.close()
    r = gzip.open('/tmp/p', 'rb')
    file_content = r.read()
    r.close()
Question
How can I unzip the streams directly and read the contents?
I do not want to create temp files; they don't look good.
Solution
Yes, you can use the zlib module to decompress byte streams:

    import zlib

    def stream_gzip_decompress(stream):
        # 32 + MAX_WBITS: detect the gzip (or zlib) header automatically
        dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
        for chunk in stream:
            rv = dec.decompress(chunk)
            if rv:
                yield rv

Adding 32 to zlib.MAX_WBITS tells zlib to detect the header automatically, so the gzip header is recognized and skipped rather than returned as data.
The S3 key object is an iterator, so you can do:

    for data in stream_gzip_decompress(k):
        # do something with the decompressed data
        pass
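To try the approach without an S3 connection, here is a minimal sketch that feeds the same kind of generator from an in-memory payload. The `fake_chunks` helper and the sample XML are hypothetical stand-ins for the S3 key, and the trailing `flush()` call is an extra safeguard that is not part of the answer above. It assumes Python 3, where `gzip.compress` is available.

```python
import gzip
import zlib


def stream_gzip_decompress(stream):
    """Yield decompressed chunks from an iterable of compressed byte chunks."""
    # 32 + MAX_WBITS: let zlib auto-detect a gzip or zlib header.
    dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
    for chunk in stream:
        rv = dec.decompress(chunk)
        if rv:
            yield rv
    # Addition for this sketch: emit anything still buffered at end of input.
    tail = dec.flush()
    if tail:
        yield tail


def fake_chunks(payload, size=16):
    """Hypothetical stand-in for the S3 key: yields fixed-size byte chunks."""
    for start in range(0, len(payload), size):
        yield payload[start:start + size]


# Made-up sample document, compressed in memory instead of fetched from S3.
original = b"<doc>" + b"<item>hello</item>" * 200 + b"</doc>"
compressed = gzip.compress(original)

# Reassemble the decompressed chunks; no temp file is involved.
restored = b"".join(stream_gzip_decompress(fake_chunks(compressed)))
assert restored == original
```

Joining the chunks is only for the check at the end; for large documents you would normally process each chunk as it arrives, which is the point of streaming.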