How to split a file and keep the first line in each of the pieces?
Question
Given: One big text-data file (e.g. CSV format) with a 'special' first line (e.g., field names).
Wanted: An equivalent of the coreutils split -l command, but with the additional requirement that the header line from the original file appear at the beginning of each of the resulting pieces.
I am guessing some concoction of split and head will do the trick?
Recommended answer
This is robhruska's script cleaned up a bit:
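# Split everything after the header line into 4-line pieces named split_aa, split_ab, ...
tail -n +2 file.txt | split -l 4 - split_
for file in split_*
do
    # Rebuild each piece as: header line, then the piece's own lines.
    head -n 1 file.txt > tmp_file
    cat "$file" >> tmp_file
    mv -f tmp_file "$file"
done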
I removed wc, cut, ls and echo in the places where they're unnecessary. I changed some of the filenames to make them a little more meaningful. I broke it out onto multiple lines only to make it easier to read.
If you want to get fancy, you could use mktemp or tempfile to create a temporary filename instead of using a hard-coded one.
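A minimal sketch of the mktemp variant, assuming a mktemp that accepts a template argument. The scratch directory and the sample data are made up for illustration; file.txt and the split_ prefix follow the script above.

```shell
#!/bin/sh
# Sketch: the same loop as above, but with a mktemp-generated temporary file.
# The sample data below is hypothetical and only there to make this runnable.

workdir=$(mktemp -d)          # scratch directory so nothing real is touched
cd "$workdir" || exit 1

# One header line plus six data lines.
printf '%s\n' 'id,name' '1,a' '2,b' '3,c' '4,d' '5,e' '6,f' > file.txt

# Split everything after the header into 4-line pieces.
tail -n +2 file.txt | split -l 4 - split_

for file in split_*
do
    # Unique temporary name in the current directory; it does not match split_*.
    tmp_file=$(mktemp ./tmp.XXXXXX) || exit 1
    head -n 1 file.txt > "$tmp_file"
    cat "$file" >> "$tmp_file"
    mv -f "$tmp_file" "$file"
done
```

One side effect worth knowing: mktemp creates its file readable and writable by the owner only, so after the mv the pieces carry those permissions. A chmod afterwards would restore whatever mode you need.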
Edit
Using GNU split it's possible to do this:
split_filter () { { head -n 1 file.txt; cat; } > "$FILE"; }; export -f split_filter; tail -n +2 file.txt | split --lines=4 --filter=split_filter - split_
Broken out for readability:
split_filter () { { head -n 1 file.txt; cat; } > "$FILE"; }
export -f split_filter
tail -n +2 file.txt | split --lines=4 --filter=split_filter - split_
When --filter is specified, split runs the command (a function in this case, which must be exported) for each output file and sets the variable FILE, in the command's environment, to the filename.
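GNU split hands the filter command to a shell (the one named by $SHELL, or /bin/sh if that is unset), so a function exported with export -f is only visible when that shell is bash. A sketch that avoids the dependency by passing the command text inline; the scratch directory and sample data are assumptions added to make it runnable.

```shell
#!/bin/sh
# Sketch: GNU split --filter with the command given inline,
# so no exported shell function is needed.

workdir=$(mktemp -d)          # scratch directory (assumption for the demo)
cd "$workdir" || exit 1

# Hypothetical sample input: one header line plus six data lines.
printf '%s\n' 'id,name' '1,a' '2,b' '3,c' '4,d' '5,e' '6,f' > file.txt

# Single quotes keep $FILE unexpanded here; split sets FILE for each piece
# and the shell it starts expands it. The piece's lines arrive on stdin (cat).
tail -n +2 file.txt |
    split --lines=4 --filter='{ head -n 1 file.txt; cat; } > "$FILE"' - split_
```

The result should match the function-based version: split_aa and split_ab, each starting with the header line.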
A filter script or function could do any manipulation it wanted to the output contents or even the filename. An example of the latter might be to output to a fixed filename in a variable directory: > "$FILE/data.dat" for example.
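The directory variant could look roughly like this. The mkdir -p step, the part_ prefix, the scratch directory and the sample data are all assumptions added for illustration; only the "$FILE/data.dat" redirection comes from the answer.

```shell
#!/bin/sh
# Sketch: use the name split puts in FILE as a directory and write each
# piece to a fixed filename (data.dat) inside it. Requires GNU split.

workdir=$(mktemp -d)          # scratch directory (assumption for the demo)
cd "$workdir" || exit 1

# Hypothetical sample input: one header line plus six data lines.
printf '%s\n' 'id,name' '1,a' '2,b' '3,c' '4,d' '5,e' '6,f' > file.txt

# The directory has to exist before the redirection, hence mkdir -p first.
tail -n +2 file.txt |
    split --lines=4 \
          --filter='mkdir -p "$FILE" && { head -n 1 file.txt; cat; } > "$FILE/data.dat"' \
          - part_
```

This should leave part_aa/data.dat and part_ab/data.dat, each beginning with the header line.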