Compare a single column of two files
Question
I have two files, each with two columns separated by a space.
I'd like to find the lines in which column 2 is not the same in both files and output them to a third file.
File A:
1 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
2 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
3 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
4 DDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
5 EEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
6 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
7 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
8 HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
File B:
1 AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
2 BBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
3 CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
4 DDDDDDDDDDDDDDDDDDDDDDDDDDDDDD
5 WWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
6 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
7 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
8 ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
9 EEEEEEEEEEEEEEEEEEEEEEEEEEEEEE
10 FFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
11 GGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
12 HHHHHHHHHHHHHHHHHHHHHHHHHHHHHH
Desired output:
5 WWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
6 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
7 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
8 ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
I assumed the easiest way to do this was to grep each line from file A in file B, but I'm new to bash and can't figure out the next step. Any help is greatly appreciated!
Recommended answer
You can use awk:
$ awk 'FNR==NR {a[$1]=$2; next} $1 in a && a[$1] != $2' fileA fileB
5 WWWWWWWWWWWWWWWWWWWWWWWWWWWWWW
6 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
7 YYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
8 ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ
It loops through the first file, storing the values in an array as a[1st col] = 2nd col. Then it loops through the second file and prints the lines that meet both of these conditions:
- The first column is present in the first file.
- The second column's value differs from the value stored for it from the first file.
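The same logic can be spelled out as a commented, multi-line awk program. This is only a sketch of the one-liner above: the shortened sample rows and the temporary directory are assumptions made for illustration, not part of the original question.

```shell
# Hypothetical, shortened sample data in a scratch directory (illustration only).
tmp=$(mktemp -d)
printf '%s\n' '4 DDDD' '5 EEEE' '6 FFFF' > "$tmp/fileA"
printf '%s\n' '4 DDDD' '5 WWWW' '6 XXXX' '9 EEEE' > "$tmp/fileB"

result=$(awk '
    # FNR restarts at 1 for each file while NR keeps counting,
    # so FNR == NR holds only while reading the first file.
    FNR == NR {
        a[$1] = $2      # remember column 2, keyed by column 1
        next            # skip the remaining rule for this line
    }
    # Second file: a pattern with no action prints the line when it is true.
    ($1 in a) && a[$1] != $2
' "$tmp/fileA" "$tmp/fileB")

printf '%s\n' "$result"
rm -r "$tmp"
```

With this sample data, the line `9 EEEE` is not printed because key 9 never appears in the first file, and `4 DDDD` is not printed because its second column is identical in both files.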
To store the result in a new file, just redirect the command's output to a file:
awk 'FNR==NR {a[$1]=$2; next} $1 in a && a[$1] != $2' fileA fileB > fileC
                                                                  ^^^^^^^