Get file names of tarred folder contents in Python

Question

I have a compressed folder called gziptest.tar.gz which contains several plaintext files.

I'd like to be able to get the filenames and corresponding contents of the files, but the examples of usage for the gzip library don't cover this.

The following code:

import gzip
in_f = gzip.open('/home/cholloway/gziptest.tar.gz')
print in_f.read()

produces the output:

gzip test/file2000664 001750 001750 00000000016 12621163624 015761 0ustar00chollowaycholloway000000 000000 I like apples
gzip test/file1000664 001750 001750 00000000025 12621164026 015755 0ustar00chollowaycholloway000000 000000 hello world
line two
gzip test/000775 001750 001750 00000000000 12621164026 015035 5ustar00chollowaycholloway000000 000000 

I could use some regular expressions to detect the start of a new file and extract the filename, but I'm wondering if this functionality already exists within gzip or another standard python library.

Recommended answer

For that file, don't use the gzip library. Use the tarfile library.

The file you are working with is the gzip-compression of a tar archive of the files test/*.

If you only want to recover the tar archive, then use gzip to uncompress the file. The resulting file is (as you discovered) an archive of the files you want.
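A minimal sketch of that first case, recovering only the tar archive. The helper name gunzip and the output path are illustrative assumptions, not part of the original answer; gzip.open and shutil.copyfileobj are standard library calls.

```python
import gzip
import shutil


def gunzip(src_path, dst_path):
    """Decompress the gzip file at src_path and write the result to dst_path."""
    with gzip.open(src_path, "rb") as compressed, open(dst_path, "wb") as out:
        # Copy in chunks so the whole archive is never held in memory at once.
        shutil.copyfileobj(compressed, out)


# Example call (the output path is a placeholder):
# gunzip("/home/cholloway/gziptest.tar.gz", "/home/cholloway/gziptest.tar")
```

The resulting file is still a tar archive, so the individual files are not yet accessible at this point.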

Logically, to access the files inside the tar archive, you would first use the gzip library to recover the tar archive and then use the tarfile library to recover the files.
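To make the two layers concrete, here is a sketch that undoes them one at a time. The function name list_members_two_step is hypothetical; the approach assumes the file is a gzip-compressed tar archive, as in the question.

```python
import gzip
import tarfile


def list_members_two_step(path):
    """Undo the gzip layer first, then parse the tar layer."""
    with gzip.open(path, "rb") as raw_tar:
        # Mode "r:" means "plain tar, no compression", because the gzip
        # layer has already been removed by gzip.open above.
        with tarfile.open(fileobj=raw_tar, mode="r:") as tar:
            return tar.getnames()
```

This is only meant to show the layering; there is normally no reason to write it this way.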

Practically, we only use the tarfile library: the tarfile library will automatically invoke the gzip library on your behalf.

I've copied this example from the examples section of the tarfile documentation:

import tarfile
tar = tarfile.open("sample.tar.gz")
tar.extractall()
tar.close()
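
The example above writes the files to disk. Since the question asks for the file names and their contents, the following sketch reads both directly from the archive without extracting anything. The function name read_tar_contents is an illustrative assumption; getmembers, isfile and extractfile are part of the tarfile module.

```python
import tarfile


def read_tar_contents(path):
    """Return {member name: contents as bytes} for regular files in the archive."""
    contents = {}
    # Mode "r" lets tarfile detect the gzip compression by itself.
    with tarfile.open(path, "r") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # skip directories, links and other non-file entries
            with tar.extractfile(member) as f:
                contents[member.name] = f.read()
    return contents


# Example call (path taken from the question):
# for name, data in read_tar_contents("/home/cholloway/gziptest.tar.gz").items():
#     print(name, data.decode())
```

The names are the paths stored in the archive, including the folder prefix. The contents come back as bytes, so decode them if you need text.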
