Python ungzipping stream of bytes?


Question

The situation is as follows:

  • I get gzipped XML documents from Amazon S3

  import boto
  from boto.s3.connection import S3Connection
  from boto.s3.key import Key
  conn = S3Connection('access Id', 'secret access key')
  b = conn.get_bucket('mydev.myorg')
  k = Key(b)
  k.key = 'documents/document.xml.gz'  # key is an attribute, not a method

  • I read them into a file as

      import gzip
      f = open('/tmp/p', 'wb')  # binary mode: the payload is gzip bytes
      k.get_file(f)
      f.close()
      r = gzip.open('/tmp/p', 'rb')
      file_content = r.read()
      r.close()
    

  • Question

    How can I ungzip the streams directly and read the contents?

    I do not want to create temp files; they don't look good.

    Recommended answer

    Yes, you can use the zlib module to decompress byte streams:

    import zlib
    
    def stream_gzip_decompress(stream):
        dec = zlib.decompressobj(32 + zlib.MAX_WBITS)  # +32: auto-detect a gzip or zlib header
        for chunk in stream:
            rv = dec.decompress(chunk)
            if rv:
                yield rv
    

    Adding 32 to the window-size argument tells zlib to detect the header automatically, so it accepts either a gzip or a zlib header and strips it before returning the decompressed data.
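To illustrate how the window-size argument selects the expected container format, here is a minimal sketch that runs entirely in memory. The `payload` value and the variable names are made up for the example; only `zlib.decompress`, `zlib.compress`, `gzip.compress` and `zlib.MAX_WBITS` come from the standard library.

```python
import gzip
import zlib

payload = b"<doc>hello</doc>"  # hypothetical sample document

gzip_bytes = gzip.compress(payload)  # gzip container
zlib_bytes = zlib.compress(payload)  # zlib container

auto_detect = 32 + zlib.MAX_WBITS  # accept either a gzip or a zlib header
gzip_only = 16 + zlib.MAX_WBITS    # accept only a gzip header

# The same wbits value handles both containers when 32 is added.
from_gzip = zlib.decompress(gzip_bytes, auto_detect)
from_zlib = zlib.decompress(zlib_bytes, auto_detect)

# Adding 16 instead works for gzip data but rejects zlib data.
from_gzip_only = zlib.decompress(gzip_bytes, gzip_only)
```

Since the documents in the question are always gzipped, `16 + zlib.MAX_WBITS` would probably work as well; adding 32 is simply the more permissive choice.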

    The S3 key object is an iterator, so you can do:

    for data in stream_gzip_decompress(k):
        pass  # do something with the decompressed data
    
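The generator can be tried without an S3 connection by feeding it any iterable of byte chunks. The sketch below assumes a hypothetical helper, `iter_chunks`, that stands in for the S3 key, and a made-up `original` document. The trailing `flush()` call is an optional defensive addition that is not part of the answer above.

```python
import gzip
import zlib


def stream_gzip_decompress(stream):
    # Same generator as in the answer; +32 auto-detects the gzip header.
    dec = zlib.decompressobj(32 + zlib.MAX_WBITS)
    for chunk in stream:
        rv = dec.decompress(chunk)
        if rv:
            yield rv
    # Defensive addition: emit anything still buffered in the decompressor.
    tail = dec.flush()
    if tail:
        yield tail


def iter_chunks(data, size):
    """Hypothetical stand-in for the S3 key: yield `data` in `size`-byte pieces."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


# Made-up XML payload, compressed in memory instead of fetched from S3.
original = b"<doc>" + b"<item>value</item>" * 1000 + b"</doc>"
compressed = gzip.compress(original)

# Reassemble the decompressed chunks to check the round trip.
restored = b"".join(stream_gzip_decompress(iter_chunks(compressed, 64)))
```

Joining the chunks is only for verification here; in real use each chunk can be processed as it arrives, for example by passing it to an incremental XML parser, so the whole document never has to sit in memory.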
