Split input into multiple outputs based on content?
Question
Let's assume there is a file that looks like this:
xxxx aa whatever
yyyy bb whatever
zzzz aa whatever
I'd like to split it into two files, containing:
First:
xxxx aa whatever
zzzz aa whatever
Second:
yyyy bb whatever
That is, I want to group the lines by some value in each line (the rule could be: the second space-separated word), without reordering the lines within each group.
Of course I could write a program to do it, but I'm wondering whether there is a ready-made tool that can do something like this.
Sorry, I didn't mention this, as I assumed it was obvious: the number of distinct "words" is huge, at least 10,000. So any solution that relies on enumerating the words beforehand will not work.
Also, I'd rather avoid a multi-pass split, since the files in question are usually pretty big.
Recommended answer
This will create files named output.aa, output.bb, etc.:
awk '{print >> ("output." $2)}' input.file
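With 10,000 or more distinct words, some awk implementations may hit a limit on the number of files open at once (gawk works around this by multiplexing file descriptors, others may not). The following is a minimal sketch of one way around that, not part of the original answer: it closes each output file right after writing to it, and relies on ">>" to reopen the file in append mode the next time the same word appears. The scratch directory and the sample input are assumptions added only to make the example self-contained.

```shell
# Work in a scratch directory (an assumption for this demo) so the
# output.* files do not mix with anything else.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input taken from the question.
printf '%s\n' 'xxxx aa whatever' 'yyyy bb whatever' 'zzzz aa whatever' > input.file

# Single pass over the input; lines keep their original relative order
# within each output file.
awk '{
    out = "output." $2   # file name derived from the 2nd whitespace-separated field
    print >> out         # append the whole line to that file
    close(out)           # release the file so only one output is open at a time
}' input.file

cat output.aa
cat output.bb
```

Closing after every line costs an extra open and close per input line, so this is likely to be slower than the one-liner; it is only worth it if the plain version fails with a "too many open files" style error. In either version, ">>" appends to files left over from a previous run, so remove old output.* files first.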