How to compare two columns of two csv files with awk?


Problem description

I have two csv files that I need to compare on one column.

My member.csv file looks like this:

ID|lastName|firstName
01|Lastname01|Firstname01
02|Lastname02|Firstname02

The second file, check-ID.csv, looks like this:

Lastname01|Name01|pubID01|Hash01
Lastname02|Name02|pubID02|Hash02a
Lastname03|Name03|pubID03|Hash03
Lastname02|Name02|pubID02|Hash02b
Lastname01|Name01|pubID01|Hash01b

Note that Lastname03 is not in my member.csv.

What I want is to check whether the value in the first column of check-ID.csv equals a value in the second column of member.csv.

I tried this script.awk:

NR==FNR{a[$1]=$1; b[$1]=$0; next} 
$2==a[$1]{ delete b[$1]}

END{for (i in b ) print b[i]}

and ran it with:

awk -f script.awk check-ID.csv member.csv

The problem is that the result is not filtered.

I would like filtered and sorted output, so that only members are listed, like this:

Lastname01|Name01|pubID01|Hash01
Lastname01|Name01|pubID01|Hash01b
Lastname02|Name02|pubID02|Hash02a
Lastname02|Name02|pubID02|Hash02b

Any help is appreciated!

Recommended answer

Could you try the following? You were close. The main change is the order in which the input files are read: member.csv is read first and check-ID.csv second, because check-ID.csv holds the lines that need to be printed, while member.csv is only needed for its second field. The field separator is also set to | so that the fields line up with the columns.

awk '
BEGIN{
  FS="|"
}
FNR==NR{
  a[$2]
  next
}
($1 in a)
' member.csv check-ID.csv |
sort -t'|' -k1
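
One detail worth checking in the original attempt: script.awk never sets a field separator, so awk splits on whitespace and each |-delimited line is treated as a single field. The sketch below uses one sample line from member.csv to show the difference; the variable names (line, default_nf, pipe_nf, second) are made up for this illustration.

```shell
# One sample line taken from member.csv in the question.
line='01|Lastname01|Firstname01'

# Default separator (whitespace): the whole line is one field, so $2 is empty.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# With FS="|": the line splits into three fields and $2 is the last name.
pipe_nf=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "|" } { print NF }')
second=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "|" } { print $2 }')

echo "default NF=$default_nf, pipe NF=$pipe_nf, \$2=$second"
```

Without the separator, a comparison such as $2==a[$1] in script.awk compares an empty $2 against an array entry keyed by the whole line, which is probably why nothing was filtered.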

Explanation: the awk command from the answer, with a comment on each line.

awk '                             ## Start the awk program.
BEGIN{                            ## BEGIN block: runs once before any input is read.
  FS="|"                          ## Set the field separator to |.
}
FNR==NR{                          ## FNR==NR is true only while the first file, member.csv, is being read.
  a[$2]                           ## Create an entry in array a indexed by the 2nd field (the last name).
  next                            ## Skip the remaining rules for this line.
}
($1 in a)                         ## For check-ID.csv: print the line if its 1st field is an index in a.
' member.csv check-ID.csv |       ## Input files, in reading order; the output is piped to sort.
sort -t'|' -k1                    ## Sort the awk output with | as the separator, keyed from the 1st field onward.
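
To check the command against the sample data from the question, the following sketch recreates both files in a temporary directory and runs the same awk and sort pipeline. The function name filter_members and the variable demo_dir are hypothetical, chosen only for this example; the file names follow the question (member.csv and check-ID.csv).

```shell
# Work in a throwaway directory; demo_dir is a hypothetical name.
demo_dir=$(mktemp -d)

cat > "$demo_dir/member.csv" <<'EOF'
ID|lastName|firstName
01|Lastname01|Firstname01
02|Lastname02|Firstname02
EOF

cat > "$demo_dir/check-ID.csv" <<'EOF'
Lastname01|Name01|pubID01|Hash01
Lastname02|Name02|pubID02|Hash02a
Lastname03|Name03|pubID03|Hash03
Lastname02|Name02|pubID02|Hash02b
Lastname01|Name01|pubID01|Hash01b
EOF

# filter_members MEMBER_FILE CHECK_FILE
# Prints the CHECK_FILE lines whose 1st field appears as a 2nd field
# in MEMBER_FILE, sorted with | as the separator.
filter_members() {
  awk '
  BEGIN { FS = "|" }
  FNR == NR { a[$2]; next }   # first file: remember every last name
  ($1 in a)                   # second file: print lines with a known last name
  ' "$1" "$2" | sort -t'|' -k1
}

result=$(filter_members "$demo_dir/member.csv" "$demo_dir/check-ID.csv")
printf '%s\n' "$result"
rm -r "$demo_dir"
```

The header line of member.csv also puts lastName into the array, which is harmless here because no line in check-ID.csv starts with that value. If that could happen in real data, adding FNR>1 to the first rule would be one way to skip the header.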
