Compare columns from different files and print those that DO NOT match
Question
I have two files, file1 and file2. I want to compare columns $1, $2, $3 and $4 of file1 with columns $1, $2, $3 and $4 of file2, and print those rows of file2 that do not match any row in file1.
For example:
file1
aaa bbb ccc 1 2 3
aaa ccc eee 4 5 6
fff sss sss 7 8 9
file2
aaa bbb ccc 1 f a
mmm nnn ooo 1 d e
aaa ccc eee 4 a b
ppp qqq rrr 4 e a
sss ttt uuu 7 m n
fff sss sss 7 5 6
I want this as output:
mmm nnn ooo 1 d e
ppp qqq rrr 4 e a
sss ttt uuu 7 m n
I have seen questions here about finding the rows that do match and printing them, but not the reverse: printing those that DO NOT match.
Thanks!
Recommended answer
Use the following script:
awk '{k=$1 FS $2 FS $3 FS $4} NR==FNR{a[k]; next} !(k in a)' file1 file2
k is the concatenated value of columns 1, 2, 3 and 4, delimited by FS (see comments), and is used later as a key in the lookup array a. NR==FNR is true only while awk is reading file1, so during that pass the script creates an entry in a indexed by k and then skips to the next line.
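To make the first pass concrete, here is a minimal sketch that runs only the key-building step on file1 from the question and lists the indexes that end up in a. The temporary directory and the keys variable are assumptions made for this demo, not part of the original answer.

```shell
#!/bin/sh
# Demo setup (assumption): write the question's file1 into a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1" <<'EOF'
aaa bbb ccc 1 2 3
aaa ccc eee 4 5 6
fff sss sss 7 8 9
EOF

# Build the four-column key for every line and store it as an index of a.
# Merely referencing a[k] creates the entry; no value is needed.
# The order of for-in is unspecified in awk, so the keys are sorted afterwards.
keys=$(awk '{k=$1 FS $2 FS $3 FS $4; a[k]} END{for (k in a) print k}' "$tmpdir/file1" | sort)

printf '%s\n' "$keys"
rm -rf "$tmpdir"
```

With the default FS (a single space), each key is simply the first four fields joined by spaces, for example "aaa bbb ccc 1".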
For the remaining lines of input, which come from file2, I check with !(k in a) whether the index is missing from a. If that evaluates to true, awk prints the line.
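The whole script can be tried against the sample data from the question. The sketch below is the same one-liner spread over several lines with comments; the temporary directory and the result variable are assumptions added only to keep the demo self-contained.

```shell
#!/bin/sh
# Demo setup (assumption): recreate file1 and file2 from the question in a temporary directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/file1" <<'EOF'
aaa bbb ccc 1 2 3
aaa ccc eee 4 5 6
fff sss sss 7 8 9
EOF
cat > "$tmpdir/file2" <<'EOF'
aaa bbb ccc 1 f a
mmm nnn ooo 1 d e
aaa ccc eee 4 a b
ppp qqq rrr 4 e a
sss ttt uuu 7 m n
fff sss sss 7 5 6
EOF

result=$(awk '
  # Every line, in both files: build the key from the first four fields.
  { k = $1 FS $2 FS $3 FS $4 }
  # First file only (NR equals FNR): remember the key and skip to the next line.
  NR == FNR { a[k]; next }
  # Second file: a pattern without an action prints the line when it is true,
  # that is, when the key was never stored while reading the first file.
  !(k in a)
' "$tmpdir/file1" "$tmpdir/file2")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

This prints the three rows of file2 requested in the question: the mmm, ppp and sss lines. The in operator only tests for membership, so checking file2 keys does not add new entries to a.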