Loop over files in tar.gz
Question
I have files.tar.gz with a bunch of CSV files inside, each with a header row. I want to loop over the CSV files and process them one at a time. If I just run zcat | tar -xO on them all, the contents arrive as one concatenated stream, so I can't tell where each file starts and therefore can't find its header.
How can I loop over the files in the archive one at a time, and pipe them individually to a processing command?
Recommended answer
The GNU tar utility can extract an individual file from an archive to stdout with
tar -O -x -z -f archive.tgz file
where file is the member's name as it appears in the archive listing.
Here is a loop that might solve your problem, assuming that running tar once per file is not prohibitively slow (each call has to decompress and scan the archive again).
tar tzf files.tar.gz | while IFS= read -r f ; do
    echo ">>> Processing file $f"
    tar Oxzf files.tar.gz "$f" | head | cat -n
done
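If rerunning tar for every member turns out to be too slow, one possible alternative is GNU tar's --to-command option, which reads the archive once, runs a command for each regular file, feeds that file's contents to the command on stdin, and exposes the member name in the TAR_FILENAME environment variable. The following is only a sketch under assumptions: it requires GNU tar, the sample CSV files and the temporary directory are made up so the example is self-contained, and counting data rows stands in for the real processing command.

```shell
#!/usr/bin/env bash
# Sketch: single-pass processing of archive members with GNU tar's --to-command.
set -euo pipefail

# Hypothetical sample data, created only to make the example self-contained.
workdir=$(mktemp -d)
printf 'id,name\n1,alpha\n2,beta\n' > "$workdir/a.csv"
printf 'id,name\n3,gamma\n'         > "$workdir/b c.csv"
tar -czf "$workdir/files.tar.gz" -C "$workdir" a.csv "b c.csv"

# For each regular file in the archive, tar runs the command through the shell.
# The command sees the member's contents on stdin and its name in TAR_FILENAME.
# Here "tail -n +2" skips the header line and "wc -l" counts the remaining rows.
result=$(tar -xzf "$workdir/files.tar.gz" \
    --to-command='echo ">>> $TAR_FILENAME: $(tail -n +2 | wc -l) data rows"')

printf '%s\n' "$result"
rm -rf "$workdir"
```

The command string is single-quoted so that TAR_FILENAME is expanded by the shell that tar starts for each member, not by the calling script. The command should read its whole input, as tail does here.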
command | while IFS= read -r line
is a common shell idiom (it works in Bash and in other POSIX shells). Clearing the IFS variable stops read from stripping leading and trailing whitespace from each file name, and -r stops it from treating backslashes as escape characters, so names are passed to tar exactly as they were listed.
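To see what the two parts of the idiom change, here is a small experiment. It is a sketch only, and the file names in it are hypothetical, chosen to contain an inner space, surrounding spaces, and a backslash.

```shell
#!/usr/bin/env bash
# Sketch: compare plain "read" with "IFS= read -r" on awkward file names.
set -euo pipefail

# Hypothetical names: one with an inner space, one with outer spaces and a backslash.
inner='b c.csv'
awkward='  data\2024.csv '

# Plain read with a single variable keeps inner spaces...
inner_plain=$(printf '%s\n' "$inner" | { read f; printf '[%s]' "$f"; })

# ...but strips leading/trailing whitespace and consumes the backslash.
awkward_plain=$(printf '%s\n' "$awkward" | { read f; printf '[%s]' "$f"; })

# With IFS cleared and -r, the line is stored unchanged.
awkward_kept=$(printf '%s\n' "$awkward" | { IFS= read -r f; printf '[%s]' "$f"; })

printf '%s\n' "$inner_plain" "$awkward_plain" "$awkward_kept"
```

The brackets in the output make the stripped or preserved whitespace visible.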