How to check whether an archive member is a file or a folder in Python?

Question

I have an archive that I do not want to extract, but I want to check whether each of its members is a file or a directory.

os.path.isdir and os.path.isfile do not work because I am working with an archive. The archive can be any one of tar, bz2, zip or tar.gz (so I cannot rely on a single format-specific library). The code should also work on any platform, such as Linux or Windows. Can anybody help me with this?

Recommended answer

You've stated that you need to support "tar, bz2, zip or tar.gz". Python's tarfile module automatically handles gzip- and bzip2-compressed tar files, so there are really only two types of archive that you need to support: tar and zip. (bz2 by itself is not an archive format; it is just compression.)

You can determine whether a given file is a tar file with tarfile.is_tarfile(). This will also work on tar files compressed with gzip or bzip2 compression. Within a tar file you can determine whether a file is a directory using TarInfo.isdir() or a file with TarInfo.isfile().

Similarly, you can determine whether a file is a zip file using zipfile.is_zipfile(). In a zip archive, entries whose names end with / are directories; on Python 3.6 and later, ZipInfo.is_dir() performs that check for you.
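As a minimal sketch of the zip case, assuming Python 3.6 or later, the following builds a small archive in memory and classifies its entries. The helper name classify_zip_members and the sample entry names are hypothetical and chosen only for illustration.

```python
import io
import zipfile


def classify_zip_members(zip_source):
    """Return (name, kind) pairs for every member of a zip archive.

    zip_source may be a path or a seekable file-like object.
    (Hypothetical helper, not part of the zipfile module.)
    """
    results = []
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            # ZipInfo.is_dir() (Python 3.6+) is true when the name ends with "/".
            kind = 'directory' if info.is_dir() else 'file'
            results.append((info.filename, kind))
    return results


# Build a small archive in memory so the sketch needs no files on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('docs/', '')                 # explicit directory entry
    archive.writestr('docs/readme.txt', 'hello')  # regular file entry

for name, kind in classify_zip_members(buffer):
    print('{} is a {}'.format(name, kind))
```

Note that some zip files contain no explicit directory entries at all; in that case the directories can only be inferred from the path prefixes of the file entries.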

So, given a file name, you can do this:

import tarfile
import zipfile

filename = 'test.tgz'

if tarfile.is_tarfile(filename):
    # tarfile.open() detects gzip and bzip2 compression automatically.
    with tarfile.open(filename) as f:
        for info in f:
            if info.isdir():
                file_type = 'directory'
            elif info.isfile():
                file_type = 'file'
            else:
                # Symlinks, devices, FIFOs and so on end up here.
                file_type = 'unknown'
            print('{} is a {}'.format(info.name, file_type))

elif zipfile.is_zipfile(filename):
    with zipfile.ZipFile(filename) as f:
        for name in f.namelist():
            # Directory entries in a zip archive have names ending in '/'.
            file_type = 'directory' if name.endswith('/') else 'file'
            print('{} is a {}'.format(name, file_type))

else:
    print('{} is not an accepted archive file'.format(filename))

When run on a tar file with this structure:


(py2)[mhawke@localhost tmp]$ tar tvfz /tmp/test.tgz
drwxrwxr-x mhawke/mhawke     0 2016-02-29 12:38 x/
lrwxrwxrwx mhawke/mhawke     0 2016-02-29 12:38 x/4 -> 3
drwxrwxr-x mhawke/mhawke     0 2016-02-28 21:14 x/3/
drwxrwxr-x mhawke/mhawke     0 2016-02-28 21:14 x/3/4/
-rw-rw-r-- mhawke/mhawke     0 2016-02-28 21:14 x/3/4/zzz
drwxrwxr-x mhawke/mhawke     0 2016-02-28 21:13 x/2/
-rw-rw-r-- mhawke/mhawke     0 2016-02-28 21:13 x/2/aa
drwxrwxr-x mhawke/mhawke     0 2016-02-28 21:13 x/1/
-rw-rw-r-- mhawke/mhawke     0 2016-02-28 21:13 x/1/abc
-rw-rw-r-- mhawke/mhawke     0 2016-02-28 21:13 x/1/ab
-rw-rw-r-- mhawke/mhawke     0 2016-02-28 21:13 x/1/a

The output is:


x is a directory
x/4 is a unknown
x/3 is a directory
x/3/4 is a directory
x/3/4/zzz is a file
x/2 is a directory
x/2/aa is a file
x/1 is a directory
x/1/abc is a file
x/1/ab is a file
x/1/a is a file

Notice that x/4 is "unknown" because it is a symbolic link.
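If symbolic links should be reported by name rather than as "unknown", TarInfo also offers issym() and islnk(). The sketch below is one possible extension, not part of the original answer; the helper name tar_member_kind and the in-memory sample archive are assumptions made for illustration.

```python
import io
import tarfile


def tar_member_kind(info):
    """Map a TarInfo object to a readable type name (hypothetical helper)."""
    if info.isdir():
        return 'directory'
    if info.isfile():
        return 'file'
    if info.issym():
        return 'symbolic link'
    if info.islnk():
        return 'hard link'
    return 'unknown'


# Build a tiny tar archive in memory: a directory, a symlink and a regular file.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode='w') as archive:
    directory = tarfile.TarInfo('x')
    directory.type = tarfile.DIRTYPE
    archive.addfile(directory)

    link = tarfile.TarInfo('x/4')
    link.type = tarfile.SYMTYPE
    link.linkname = '3'               # target of the symbolic link
    archive.addfile(link)

    regular = tarfile.TarInfo('x/a')  # the default type is a regular file
    archive.addfile(regular)

# Read the archive back without extracting anything.
buffer.seek(0)
with tarfile.open(fileobj=buffer) as archive:
    for info in archive:
        print('{} is a {}'.format(info.name, tar_member_kind(info)))
```

Other member types, such as devices and FIFOs, still fall through to 'unknown' here; TarInfo has isdev() and isfifo() if those need to be distinguished as well.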

With zipfile there is no easy way to distinguish a symlink (or another special file type) from a directory or a normal file. The information is stored in the ZipInfo.external_attr attribute, whose upper 16 bits hold the Unix file mode, but it is messy to get it back out:

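import stat

# f is the zipfile.ZipFile object from the example above;
# index 1 is assumed to be the symlink entry.
linked_file = f.filelist[1]
# The upper 16 bits of external_attr hold the Unix file mode.
is_symlink = stat.S_ISLNK(linked_file.external_attr >> 16)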
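A self-contained version of that check might look like the sketch below. It assumes that the mode bits are meaningful only when the entry was written on a Unix-like system (ZipInfo.create_system == 3). The helper name is_zip_symlink and the sample entries are hypothetical.

```python
import io
import stat
import zipfile


def is_zip_symlink(info):
    """Return True if a ZipInfo entry records a Unix symlink mode.

    Hypothetical helper, not part of the zipfile module.
    """
    # The high 16 bits of external_attr carry the Unix st_mode, but only for
    # entries created on a Unix-like system (create_system == 3).
    unix_mode = info.external_attr >> 16
    return info.create_system == 3 and stat.S_ISLNK(unix_mode)


# Build an in-memory archive containing one regular file and one symlink entry.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('x/3/zzz', 'data')

    link = zipfile.ZipInfo('x/4')
    link.create_system = 3                          # mark the entry as Unix-created
    link.external_attr = (stat.S_IFLNK | 0o777) << 16
    archive.writestr(link, '3')                     # the link target is stored as content

with zipfile.ZipFile(buffer) as archive:
    for info in archive.infolist():
        print(info.filename, is_zip_symlink(info))
```

Archives produced by tools that do not record Unix modes carry no such information, so the check can only report what the archiver chose to store.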
