Splitting values based on values in a specific column
Problem description
I have a file that I would like to split into multiple files, each containing only unique values in the first column. For example, here is a file:
fileA.txt
1 Cat
1 Dog
1 Frog
2 Boy
2 Girl
3 Tree
3 Leaf
3 Branch
3 Trunk
I would like my output to look something like this:
file1.txt
1 Cat
2 Boy
3 Tree
file2.txt
1 Dog
2 Girl
3 Leaf
file3.txt
1 Frog
3 Branch
file4.txt
3 Trunk
If a value does not exist, I want it to be skipped. I have searched for situations similar to mine but have come up short. Does anyone have an idea of how to do this?
In theory, this awk command should work: awk '{print > "file" ++a[$1] ".txt"}' input. However, I can't get it to work properly (most likely because I'm working on a Mac). Does anyone know of an alternative way?
Recommended answer
An unparenthesized expression on the right side of output redirection is undefined behavior, so awk implementations differ in how they parse it. Try awk '{print > ("file" ++a[$1] ".txt")}' input.
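As a quick check, the parenthesized form can be run against the sample input from the question. This is only a minimal sketch, assuming a POSIX shell and any awk; the scratch directory created with mktemp is an illustrative choice and not part of the original answer.

```shell
# Work in a scratch directory so no existing files are touched (illustrative setup).
dir=$(mktemp -d)
cd "$dir" || exit 1

# Recreate the sample input from the question.
printf '%s\n' '1 Cat' '1 Dog' '1 Frog' '2 Boy' '2 Girl' \
              '3 Tree' '3 Leaf' '3 Branch' '3 Trunk' > fileA.txt

# a[$1] counts how often each first-column value has been seen so far,
# so the Nth occurrence of a value is written to fileN.txt.
awk '{print > ("file" ++a[$1] ".txt")}' fileA.txt

cat file3.txt
# 1 Frog
# 3 Branch
```

Value 2 occurs only twice, so it never reaches file3.txt; that is how a missing value gets skipped without any extra logic.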
If having too many files open concurrently is an issue, then get GNU awk, which manages the open files for you. If you can't, close each file after writing to it:
$ ls
fileA.txt
$ awk '{f="file" ++a[$1] ".txt"; print >> f; close(f)}' fileA.txt
$ ls
file1.txt file2.txt file3.txt file4.txt fileA.txt
$ cat file1.txt
1 Cat
2 Boy
3 Tree
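One thing to keep in mind with the close() variant: because it writes with >>, running it a second time in the same directory appends a second copy of every line to the existing output files. A possible way around that, sketched here under the assumption that truncating stale output is what you want, is to use > the first time each file is written and >> afterwards. The seen array is a name introduced for this illustration only.

```shell
# Scratch directory and sample input, as in the question (illustrative setup).
dir=$(mktemp -d)
cd "$dir" || exit 1
printf '%s\n' '1 Cat' '1 Dog' '1 Frog' '2 Boy' '2 Girl' \
              '3 Tree' '3 Leaf' '3 Branch' '3 Trunk' > fileA.txt

split_by_occurrence() {
    awk '{
        f = "file" ++a[$1] ".txt"
        if (seen[f]++) print >> f   # later writes in this run append
        else           print > f    # first write truncates any stale file
        close(f)                    # keep at most one output file open
    }' "$1"
}

# Running it twice leaves the same result as running it once.
split_by_occurrence fileA.txt
split_by_occurrence fileA.txt

cat file1.txt
# 1 Cat
# 2 Boy
# 3 Tree
```

Closing after every record trades speed for a small file-descriptor footprint, so on large inputs this will be noticeably slower than keeping the files open.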