How to compare two columns of two csv files with awk?
Question
I have two csv files that I need to compare on one column.
My member.csv file looks like this:
ID|lastName|firstName
01|Lastname01|Firstname01
02|Lastname02|Firstname02
The second file, check-ID.csv, looks like this:
Lastname01|Name01|pubID01|Hash01
Lastname02|Name02|pubID02|Hash02a
Lastname03|Name03|pubID03|Hash03
Lastname02|Name02|pubID02|Hash02b
Lastname01|Name01|pubID01|Hash01b
Note that Lastname03 is not in my member.csv!
What I want is to check whether the value in the first column of check-ID.csv is equal to a value in the second column of member.csv.
I tried the following script.awk:
NR==FNR{a[$1]=$1; b[$1]=$0; next}
$2==a[$1]{ delete b[$1]}
END{for (i in b ) print b[i]}
which I run with:
awk -f script.awk check-ID.csv member.csv
The problem is that the result is not filtered.
I would like a filtered and sorted output in which only members are listed, like this:
Lastname01|Name01|pubID01|Hash01
Lastname01|Name01|pubID01|Hash01b
Lastname02|Name02|pubID02|Hash02a
Lastname02|Name02|pubID02|Hash02b
Any help is appreciated!
Recommended answer
Could you please try the following? You were close. The main change is the order in which the input files are read: I read the members file first and check-ID.csv second, because the second file holds the complete lines that need to be printed, while from the members file we only need the 2nd field to check against. The field separator is also set to | so that awk splits the lines into columns.
awk '
BEGIN{
FS="|"
}
FNR==NR{
a[$2]
next
}
($1 in a)
' member.csv check-ID.csv |
sort -t'|' -k1
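To try the command end to end, the following is a minimal sketch that recreates the sample data from the question in a scratch directory and runs the same logic. The file name member.csv follows the question, and the array name members, the variable names workdir and result, and the LC_ALL=C setting (added only to make the sort order reproducible across locales) are assumptions of this sketch, not part of the original answer.

```shell
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample data copied from the question.
cat > member.csv <<'EOF'
ID|lastName|firstName
01|Lastname01|Firstname01
02|Lastname02|Firstname02
EOF

cat > check-ID.csv <<'EOF'
Lastname01|Name01|pubID01|Hash01
Lastname02|Name02|pubID02|Hash02a
Lastname03|Name03|pubID03|Hash03
Lastname02|Name02|pubID02|Hash02b
Lastname01|Name01|pubID01|Hash01b
EOF

# First pass (FNR==NR): store column 2 of member.csv as array keys.
# Second pass: print check-ID.csv lines whose column 1 is one of those keys.
result=$(awk -F'|' 'FNR==NR { members[$2]; next } $1 in members' \
         member.csv check-ID.csv | LC_ALL=C sort -t'|' -k1)

printf '%s\n' "$result"
```

With the sample data this prints the four Lastname01 and Lastname02 lines and drops the Lastname03 line. The header line of member.csv also ends up as a key (lastName), which is harmless here because no line of check-ID.csv starts with that value.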
Explanation: the same command with a detailed comment on each line.
awk ' ## Start the awk program.
BEGIN{ ## BEGIN block: runs once, before any input is read.
FS="|" ## Set the field separator to |.
}
FNR==NR{ ## FNR==NR is true only while the first input file (member.csv) is being read.
a[$2] ## Create an entry in array a keyed by the 2nd field (the last name).
next ## Skip the remaining rules and move on to the next line.
}
($1 in a) ## Second file: print the line if its 1st field is present as a key in a.
' member.csv check-ID.csv | ## Input files in reading order; the awk output is piped to sort.
sort -t'|' -k1 ## Sort the awk output with | as the separator, using a key that starts at the first field.
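The field separator setting is easy to overlook. The script.awk in the question does not set one, so awk splits on whitespace, and because the sample lines contain no blanks, each line is a single field and $2 is empty. A small sketch that shows the difference on one sample line from the question (the variable names line, default_nf, pipe_nf and pipe_second are made up for this illustration):

```shell
line='01|Lastname01|Firstname01'

# Default separator (whitespace): the line has no blanks, so it is one field.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# With | as the separator, the same line splits into three fields.
pipe_nf=$(printf '%s\n' "$line" | awk -F'|' '{ print NF }')
pipe_second=$(printf '%s\n' "$line" | awk -F'|' '{ print $2 }')

printf 'default NF=%s, pipe NF=%s, second field=%s\n' \
       "$default_nf" "$pipe_nf" "$pipe_second"
```

Passing -F'|' on the command line, as done here, should be equivalent to setting FS="|" in a BEGIN block as the answer does.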