Compare files with awk


Question

Hi, I have two similar files (both with 3 columns). I'd like to check whether these two files contain the same elements (possibly listed in a different order). To start with, I'd like to compare only the first column.

file1.txt

"aba" 0 0
"abc" 0 1
"abd" 1 1
"xxx" 0 0

file2.txt

"xyz" 0 0
"aba" 0 0
"xxx" 0 0
"abc" 1 1

How can I do this using awk? I looked around, but found only complicated examples. What if I also want to include the other two columns in the comparison? The output should give me the number of matching elements.

Recommended answer

To print the elements common to both files:

$ awk 'NR==FNR{a[$1];next}$1 in a{print $1}' file1.txt file2.txt
"aba"
"xxx"
"abc"

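Explanation:

NR and FNR are awk variables: NR stores the total number of records read so far across all input files, and FNR stores the record number within the current file (by default, a record is a line). As long as the first file is not empty, the two are equal only while awk is reading the first file.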

NR==FNR {      # Only true while reading the first file
    a[$1]      # Build an associative array keyed on the first column
    next       # Skip the remaining rules and move on to the next record
}
$1 in a {      # True if column one of the second file is a key in the array
    print $1   # If so, print it
}

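To see why NR==FNR singles out the first file, it can help to print both counters for every record. The following is only a small sketch: the helper name show_counters and the scratch directory are illustrative assumptions, and the sample data is copied from the question.

```shell
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' '"aba" 0 0' '"abc" 0 1' '"abd" 1 1' '"xxx" 0 0' > "$tmpdir/file1.txt"
printf '%s\n' '"xyz" 0 0' '"aba" 0 0' '"xxx" 0 0' '"abc" 1 1' > "$tmpdir/file2.txt"

# show_counters is a hypothetical helper, not part of awk.
# For every record it prints NR, FNR, and whether NR == FNR holds.
# NR keeps counting across files; FNR restarts at 1 for each new file.
show_counters() {
    awk '{ print NR, FNR, (NR == FNR ? "first" : "later") }' "$@"
}

show_counters "$tmpdir/file1.txt" "$tmpdir/file2.txt"
```

With the sample data, the first four records are tagged "first" (NR and FNR both run from 1 to 4), and the next four are tagged "later" (NR runs from 5 to 8 while FNR restarts at 1).
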
If you want to match whole lines, use $0:

$ awk 'NR==FNR{a[$0];next}$0 in a{print $0}' file1.txt file2.txt
"aba" 0 0
"xxx" 0 0

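Whole-line matching with $0 is exact, so a stray trailing space or a doubled separator makes two otherwise identical lines differ. One possible workaround, sketched here under the assumption that whitespace differences should be ignored, is to let awk rebuild each record from its fields before comparing. The helper name common_lines_normalized and the deliberately messy copy of file1.txt are illustrative, not from the original answer.

```shell
# Sample data: file1.txt deliberately contains trailing and doubled spaces.
tmpdir=$(mktemp -d)
printf '%s\n' '"aba" 0 0 ' '"abc" 0 1' '"abd" 1 1 ' '"xxx"  0 0' > "$tmpdir/file1.txt"
printf '%s\n' '"xyz" 0 0' '"aba" 0 0' '"xxx" 0 0' '"abc" 1 1' > "$tmpdir/file2.txt"

# common_lines_normalized FILE1 FILE2 (hypothetical helper)
# Prints the lines of FILE2 that also occur in FILE1, ignoring
# differences in the amount of whitespace between fields.
common_lines_normalized() {
    awk '
        { $1 = $1 }                 # assigning a field rebuilds $0 with single spaces
        NR == FNR { a[$0]; next }   # first file: remember each normalized line
        $0 in a                     # second file: print lines seen in the first
    ' "$1" "$2"
}

common_lines_normalized "$tmpdir/file1.txt" "$tmpdir/file2.txt"
```

On this data the function prints "aba" 0 0 and "xxx" 0 0, in the order they appear in the second file.
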
Or a specific set of columns:

$ awk 'NR==FNR{a[$1,$2,$3];next}($1,$2,$3) in a{print $1,$2,$3}' file1.txt file2.txt
"aba" 0 0
"xxx" 0 0

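The question asks for the number of matching elements rather than the elements themselves. A minimal sketch of one way to get that is to increment a counter instead of printing, and to report it in an END rule. The function name count_common and its "first"/"line" mode argument are assumptions made for this example.

```shell
# Recreate the two sample files from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' '"aba" 0 0' '"abc" 0 1' '"abd" 1 1' '"xxx" 0 0' > "$tmpdir/file1.txt"
printf '%s\n' '"xyz" 0 0' '"aba" 0 0' '"xxx" 0 0' '"abc" 1 1' > "$tmpdir/file2.txt"

# count_common MODE FILE1 FILE2 (hypothetical helper)
# MODE is "first" to compare only column 1, or "line" to compare whole lines.
# Prints how many records of FILE2 have a key that also occurs in FILE1.
count_common() {
    awk -v mode="$1" '
        { key = (mode == "line") ? $0 : $1 }   # pick the comparison key
        NR == FNR { seen[key]; next }          # first file: remember each key
        (key in seen) { n++ }                  # second file: count known keys
        END { print n + 0 }                    # n + 0 prints 0 when nothing matched
    ' "$2" "$3"
}

count_common first "$tmpdir/file1.txt" "$tmpdir/file2.txt"
count_common line  "$tmpdir/file1.txt" "$tmpdir/file2.txt"
```

For the sample files this prints 3 for the first-column comparison and 2 for the whole-line comparison. Note that a key repeated in the second file is counted once per occurrence; if each key should be counted only once, the counting rule would need to track which keys have already been counted.
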