Selective extracting and opening for zipfile in Python


Question

From the docs (http://docs.python.org/2/library/zipfile), it looks like it's possible to selectively extract and open files using the zipfile module in native Python, using:

ZipFile.extract(member[, path[, pwd]])

Extract a member from the archive to the current working directory; member must be its full name or a ZipInfo object. Its file information is extracted as accurately as possible. path specifies a different directory to extract to. member can be a filename or a ZipInfo object. pwd is the password used for encrypted files.

I have a zip file, foobar.zip, laid out like this:

foobar.zip\
  \foo
      \a.txt
      \b.txt
  \bar
      \b.txt
      \c.txt

I've tried to extract files from a single sub-directory of the .zip file, but sometimes it prints nothing:

import os
import zipfile
with zipfile.ZipFile('foobar.zip','r') as inzipfile:
  for infile in inzipfile.namelist():
    if 'foo' in os.path.split(infile)[0]:
      print inzipfile.open(infile,'r').read()

I've also tried giving a list of selected files that I might want to extract, but that sometimes prints nothing too.

import os
import zipfile
wanted = ['a.txt', 'b.txt']
with zipfile.ZipFile('foobar.zip','r') as inzipfile:
  for infile in inzipfile.namelist():
    if os.path.split(infile)[1] in wanted:
      print inzipfile.open(infile,'r').read()

Edited: There's nothing wrong with the code or how I'm reading the files. I think there's something wrong with my zipfile, which causes a schroedinbug where sometimes my sub-directory files don't open and inzipfile.open(infile,'r').read() returns None. Now it extracts, opens and prints the content of the file.

Any idea how to check in Python code whether all the files in a .zip file can be opened with the selective extract/open methods above?

How else can I perform selective extract/open of zipfiles? Is there a more Pythonic method?

Recommended answer

There is something wrong with your code. It's opening and reading the folder names, which are also in inzipfile.namelist(). You can see this with a simple:

print inzipfile.namelist()

which will output:

['foobar/bar/', 'foobar/bar/b.txt', 'foobar/bar/c.txt', 'foobar/foo/', 
 'foobar/foo/a.txt', 'foobar/foo/b.txt', 'foobar/']

Another way to see it is with inzipfile.printdir(), which should result in something along the following lines being printed:

File Name                                             Modified             Size
foobar/bar/                                    2014-01-12 08:53:36            0
foobar/bar/b.txt                               2014-01-12 08:54:08           60
foobar/bar/c.txt                               2014-01-12 08:54:28           60
foobar/foo/                                    2014-01-12 08:53:02            0
foobar/foo/a.txt                               2014-01-12 08:55:04           60
foobar/foo/b.txt                               2014-01-12 08:55:24           60
foobar/                                        2014-01-12 08:52:32            0

Notice that in both cases the names of all folder entries end with a / character. You can use that as a simple way to detect them:

import os
import zipfile

with zipfile.ZipFile('foobar.zip', 'r') as inzipfile:
    for infile in (name for name in inzipfile.namelist() if name[-1] != '/'):
        # exact match on the parent folder; "'foo' in ..." would also match 'foobar/bar'
        if os.path.basename(os.path.dirname(infile)) == 'foo':
            print inzipfile.open(infile,'r').read(),
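
The snippets in this answer are Python 2. On Python 3.6 and later, the same filter can be expressed with ZipInfo.is_dir() instead of testing the trailing slash by hand. The following is only a minimal sketch under that assumption: the helper names build_sample_archive and read_members_under are hypothetical, and the archive is built in memory with made-up contents so the example is self-contained. It compares the parent folder name exactly, because a substring test such as 'foo' in 'foobar/bar' is also true.

```python
import io
import posixpath
import zipfile


def build_sample_archive():
    """Build an in-memory archive that mirrors the foobar.zip layout (illustrative data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, 'w') as archive:
        # Explicit folder entries, like the ones namelist() reported above.
        for folder in ('foobar/', 'foobar/foo/', 'foobar/bar/'):
            archive.writestr(folder, '')
        archive.writestr('foobar/foo/a.txt', 'foo a')
        archive.writestr('foobar/foo/b.txt', 'foo b')
        archive.writestr('foobar/bar/b.txt', 'bar b')
        archive.writestr('foobar/bar/c.txt', 'bar c')
    buffer.seek(0)
    return buffer


def read_members_under(archive, dirname):
    """Return {member name: bytes} for file members whose parent folder is `dirname`."""
    contents = {}
    for info in archive.infolist():
        if info.is_dir():  # skip folder entries (names ending in '/')
            continue
        # Member names inside a zip always use '/', so posixpath is appropriate.
        parent = posixpath.basename(posixpath.dirname(info.filename))
        if parent == dirname:
            contents[info.filename] = archive.read(info)
    return contents


with zipfile.ZipFile(build_sample_archive()) as archive:
    for name, data in sorted(read_members_under(archive, 'foo').items()):
        print(name, data.decode('utf-8'))
```

Returning a dict rather than printing inside the loop keeps the selection logic separate from whatever is done with the contents.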

Similarly:

import os
import zipfile

wanted = {'a.txt', 'b.txt'}  # use a set, it's faster for testing membership
with zipfile.ZipFile('foobar.zip', 'r') as inzipfile:
    for infile in (name for name in inzipfile.namelist() if name[-1] != '/'):
        if os.path.split(infile)[1] in wanted:
            print inzipfile.open(infile,'r').read()
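
If the goal is to write the selected members to disk rather than print them, ZipFile.extractall() accepts a members list, which avoids calling extract() in a loop. A minimal Python 3 sketch, assuming a hypothetical helper named extract_wanted and a destination directory supplied by the caller:

```python
import posixpath
import zipfile


def extract_wanted(zip_path, wanted, dest_dir):
    """Extract only file members whose base name is in `wanted`; return their names."""
    with zipfile.ZipFile(zip_path) as archive:
        selected = [
            name for name in archive.namelist()
            # Skip folder entries, then match on the final path component.
            if not name.endswith('/') and posixpath.basename(name) in wanted
        ]
        # Parent folders of the selected members are created under dest_dir as needed.
        archive.extractall(path=dest_dir, members=selected)
    return selected
```

For example, extract_wanted('foobar.zip', {'a.txt', 'b.txt'}, 'out') would write the matching members below the out directory, keeping their paths from the archive.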

The only way I can think of to check whether all the [file] members of an archive can be opened is to actually try opening each one:

import zipfile

def check_files(zipfilename):
    """ Check and see if all members of a .zip archive can be opened.
        Beware of vacuous truth - all members of an empty archive can be opened
    """
    def can_open(archive, membername):
        try:
            archive.open(membername, 'r')  # return value ignored
        except (RuntimeError, zipfile.BadZipfile, zipfile.LargeZipFile):
            return False
        return True

    with zipfile.ZipFile(zipfilename, 'r') as archive:
        return all(can_open(archive, membername)
                    for membername in (
                        name for name in archive.namelist() if name[-1] != '/'))
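
One caveat with the check above: opening a member only reads its header, and a CRC mismatch is reported once the member's data has been read to the end. A Python 3 sketch of a stricter check that reads each member fully; the name find_unreadable_members is hypothetical, and the set of caught exceptions is an assumption about which failures are worth treating as "unreadable":

```python
import zipfile
import zlib


def find_unreadable_members(zip_path):
    """Return the names of file members that cannot be read to the end."""
    bad = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.filename.endswith('/'):
                continue  # folder entry, nothing to read
            try:
                with archive.open(info) as member:
                    # Read in chunks until EOF so the stored CRC gets verified.
                    while member.read(1 << 16):
                        pass
            except (zipfile.BadZipFile, RuntimeError, EOFError, zlib.error):
                # BadZipFile: bad header or CRC; RuntimeError: e.g. password required.
                bad.append(info.filename)
    return bad
```

An empty list means every file member was read successfully. When only the first failure matters, the standard library's ZipFile.testzip() performs a similar CRC check in one call and returns the first bad name, or None.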
