How to split a file and keep the first line in each of the pieces?

Question

Given: One big text-data file (e.g. CSV format) with a 'special' first line (e.g., field names).

Wanted: An equivalent of the coreutils split -l command, but with the additional requirement that the header line from the original file appear at the beginning of each of the resulting pieces.

I am guessing some concoction of split and head will do the trick?

Recommended answer

This is robhruska's script cleaned up a bit:

tail -n +2 file.txt | split -l 4 - split_
for file in split_*
do
    head -n 1 file.txt > tmp_file
    cat "$file" >> tmp_file
    mv -f tmp_file "$file"
done

I removed wc, cut, ls and echo in the places where they were unnecessary. I changed some of the filenames to make them a little more meaningful. I broke it out onto multiple lines only to make it easier to read.

If you want to get fancy, you could use mktemp or tempfile to create a temporary filename instead of using a hard-coded one.
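As a minimal sketch of the mktemp suggestion, the loop could look like the following. The sample contents of file.txt and the tmp_piece.XXXXXX template are assumptions made up for illustration. The template deliberately does not start with split_, so the temporary file is never picked up by the split_* glob.

```shell
#!/bin/sh
# Sketch only: run in an empty scratch directory.
# Sample input (hypothetical): one header line plus five data lines.
printf 'id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n' > file.txt

# Split everything except the header into pieces of 4 lines each.
tail -n +2 file.txt | split -l 4 - split_

for file in split_*
do
    # mktemp picks an unused name in the current directory, so the
    # final mv stays on the same filesystem.
    tmp_file=$(mktemp ./tmp_piece.XXXXXX) || exit 1
    # Header first, then the piece's own lines.
    { head -n 1 file.txt; cat "$file"; } > "$tmp_file"
    mv -f "$tmp_file" "$file"
done
```

One thing to be aware of: mktemp creates its file readable and writable by the owner only, and mv keeps that mode, so the pieces may end up with tighter permissions than the hard-coded version would give them.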

Edit

Using GNU split it's possible to do this:

split_filter () { { head -n 1 file.txt; cat; } > "$FILE"; }; export -f split_filter; tail -n +2 file.txt | split --lines=4 --filter=split_filter - split_

Broken out for readability:

split_filter () { { head -n 1 file.txt; cat; } > "$FILE"; }
export -f split_filter
tail -n +2 file.txt | split --lines=4 --filter=split_filter - split_

When --filter is specified, split runs the command (a function in this case, which must be exported) for each output file and sets the variable FILE, in the command's environment, to the filename.
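Since the filter is just a shell command that sees $FILE in its environment, the same idea can be sketched without an exported function, by passing the command as a string and handing it the header through an exported variable. This is a variation, not the script from the answer: HEADER and the sample contents of file.txt are assumptions for illustration. As far as I know, GNU split runs the filter with the shell named in $SHELL (falling back to /bin/sh), so the sketch pins SHELL to /bin/sh to keep the syntax predictable.

```shell
#!/bin/sh
# Sketch only: requires GNU split; run in an empty scratch directory.
# Sample input (hypothetical): one header line plus five data lines.
printf 'id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n' > file.txt

# Read the header once and export it so the filter's shell can see it.
HEADER=$(head -n 1 file.txt)
export HEADER

# The single quotes keep $HEADER and $FILE unexpanded here; they are
# expanded later by the shell that split starts for each piece.
tail -n +2 file.txt |
    SHELL=/bin/sh split --lines=4 \
        --filter='{ printf "%s\n" "$HEADER"; cat; } > "$FILE"' - split_
```

Compared with the exported function, this form does not depend on bash's export -f, at the cost of a less readable quoted command.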

A filter script or function could do any manipulation it wanted to the output contents or even the filename. An example of the latter might be to output to a fixed filename in a variable directory: > "$FILE/data.dat" for example.
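The "fixed filename in a variable directory" idea might be sketched as follows. The dir_ prefix, the mkdir -p step, and the sample contents of file.txt are assumptions added for illustration. With --filter, split leaves creating the output to the filter, so $FILE can be used as a directory name here.

```shell
#!/bin/sh
# Sketch only: requires GNU split; run in an empty scratch directory.
# Sample input (hypothetical): one header line plus five data lines.
printf 'id,name\n1,a\n2,b\n3,c\n4,d\n5,e\n' > file.txt

# For each piece, $FILE is dir_aa, dir_ab, ... The filter turns that
# name into a directory and writes header + piece to data.dat inside it.
tail -n +2 file.txt |
    SHELL=/bin/sh split --lines=4 \
        --filter='mkdir -p "$FILE" && { head -n 1 file.txt; cat; } > "$FILE/data.dat"' \
        - dir_
```

With the sample input this should leave dir_aa/data.dat and dir_ab/data.dat, each starting with the header line.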
