Unzip nested zip files in Python


Question

I am looking for a way to unzip nested zip files in Python. For example, consider the following structure (hypothetical names for ease):

  • Folder
    • ZipfileA.zip
      • ZipfileA1.zip
      • ZipfileA2.zip
      • ZipfileB1.zip
      • ZipfileB2.zip

...etc. I am trying to access text files that are within the second-level zips. I certainly don't want to extract everything, as the sheer numbers would crash the computer (there are several hundred zips in the first layer, and almost 10,000 in the second layer, per zip).

I have been experimenting with the 'zipfile' module, and I am able to open the first level of zip files. For example:

      import zipfile

      zipfile_obj = zipfile.ZipFile("/Folder/ZipfileA.zip")
      next_layer_zip = zipfile_obj.open("ZipfileA1.zip")
      

However, this returns a "ZipExtFile" instance (not a file or ZipFile instance), and I can't go on to open this particular data type. That is, I can't do this:

      data = next_layer_zip.open("data.txt")
      

I can, however, "read" this zip file with:

      next_layer_zip.read()
      

But this is entirely useless! (i.e. it only returns compressed data/gobbledygook).

Does anyone have any ideas on how I might go about this (without using ZipFile.extract)?

I came across this, http://pypi.python.org/pypi/zip_open/, which looks like it does exactly what I want, but it doesn't seem to work for me. (I keep getting "[Errno 2] No such file or directory:" for the files I am trying to process with that module.)

Any ideas would be much appreciated! Thanks in advance.

Recommended answer

ZipFile needs a file-like object, so you can wrap the data you read from the nested zip in an in-memory buffer (StringIO in Python 2, io.BytesIO in Python 3) to get such an object. The caveat is that you'll be loading the full (still compressed) inner zip into memory.

      import io
      import zipfile

      with zipfile.ZipFile('foo.zip') as z:
          with z.open('nested.zip') as z2:
              # io.BytesIO is the Python 3 counterpart of cStringIO.StringIO
              z2_filedata = io.BytesIO(z2.read())
              with zipfile.ZipFile(z2_filedata) as nested_zip:
                  print(nested_zip.open('data.txt').read())
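
The same idea can be wrapped in a small reusable function. This is only a sketch for Python 3: the function name read_nested_member and its parameters are made up for illustration, and it keeps the caveat above (the whole inner zip is held in memory while it is being read).

```python
import io
import zipfile


def read_nested_member(outer_path, inner_name, member_name):
    """Return the bytes of member_name stored in inner_name inside outer_path."""
    with zipfile.ZipFile(outer_path) as outer:
        # Read the inner archive (still compressed) fully into memory.
        inner_bytes = io.BytesIO(outer.read(inner_name))
    # The buffer is independent of the outer archive, which is already closed.
    with zipfile.ZipFile(inner_bytes) as inner:
        return inner.read(member_name)
```

With the names from the question, a call might look like read_nested_member("/Folder/ZipfileA.zip", "ZipfileA1.zip", "data.txt"). The result is bytes, so it needs .decode() with whatever encoding the text files actually use.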
      
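
If loading the inner zip into memory is a concern, newer Python versions offer another route. As far as I know, since Python 3.7 the object returned by ZipFile.open() supports seek() when the outer file is seekable, so it can be handed to ZipFile directly. The sketch below assumes Python 3.7 or later; the function name read_nested_member_streamed is hypothetical.

```python
import zipfile


def read_nested_member_streamed(outer_path, inner_name, member_name):
    """Read one member of a nested zip without buffering the whole inner zip."""
    with zipfile.ZipFile(outer_path) as outer:
        # Assumes Python 3.7+, where this stream is seekable if the
        # outer file is, which is what ZipFile needs for reading.
        with outer.open(inner_name) as inner_stream:
            with zipfile.ZipFile(inner_stream) as inner:
                return inner.read(member_name)
```

The trade-off is speed: seeking backwards inside a compressed member may force it to be decompressed again from the start, so for small inner zips the in-memory buffer is likely the faster choice.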

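
For the scale described in the question (hundreds of outer zips, each with thousands of inner zips), it may help to walk the archives lazily so that only one inner zip is in memory at a time. The generator below is a sketch under that assumption; the name iter_nested_text_files and the ".txt" suffix filter are illustrative choices, not part of the original answer.

```python
import io
import zipfile


def iter_nested_text_files(outer_path, suffix=".txt"):
    """Yield (inner_zip_name, member_name, data) one member at a time."""
    with zipfile.ZipFile(outer_path) as outer:
        for inner_name in outer.namelist():
            if not inner_name.lower().endswith(".zip"):
                continue
            # Only one inner archive is held in memory at any moment.
            with zipfile.ZipFile(io.BytesIO(outer.read(inner_name))) as inner:
                for member_name in inner.namelist():
                    if member_name.endswith(suffix):
                        yield inner_name, member_name, inner.read(member_name)
```

Because nothing is written to disk and each inner buffer is released before the next one is read, memory use stays bounded by the largest single inner zip plus the member being yielded.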