Split input into multiple outputs based on content?


Question

Let's assume there is a file that looks like this:

xxxx aa whatever
yyyy bb whatever
zzzz aa whatever

I'd like to split it into 2 files, containing:

First:

xxxx aa whatever
zzzz aa whatever

Second:

yyyy bb whatever

I.e., I want to group the lines based on some value in each line (the rule can be: the 2nd word, separated by spaces), without reordering the lines within each group.

Of course I can write a program to do it, but I'm wondering whether there is a ready-made tool that can do something like this?

Sorry, I didn't mention it, as I assumed it was pretty obvious: the number of different "words" is huge. We are talking about at least 10000 of them, so any solution based on enumerating the words beforehand will not work.

Also, I would rather avoid a multi-pass split, since the files in question are usually pretty big.

Recommended answer

This will create files named output.aa, output.bb, etc.:

awk '{print >> ("output." $2)}' input.file
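With 10000 or more distinct second words, a one-liner that leaves every output file open may hit the open-file limit in some awk implementations. The following is a minimal sketch of one way around that, not part of the original answer: it closes the previous output file whenever the key changes, so only one file is open at a time, while still reading the input in a single pass and preserving line order within each group. The function name `split_by_second_field` and its optional prefix argument are hypothetical, introduced only for illustration.

```shell
# split_by_second_field is a hypothetical helper name, not a standard tool.
# Usage: split_by_second_field INPUT [PREFIX]   (PREFIX defaults to "output.")
split_by_second_field() {
    awk -v prefix="${2:-output.}" '
    {
        out = prefix $2                  # target file is chosen by the 2nd field
        if (out != prev) {               # key changed since the previous line
            if (prev != "") close(prev)  # release the old file descriptor
            prev = out
        }
        print >> out                     # append, so a reopened file keeps its earlier lines
    }' "$1"
}
```

For example, `split_by_second_field input.file` writes output.aa and output.bb in the current directory. Because the files are opened in append mode, remove any stale output.* files before rerunning. The trade-off is one close and reopen each time the key changes, which costs the most when consecutive lines rarely share a key.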
