Awk - Count Each Unique Value and Match Values Between Two Files

Published: 2021/5/9 20:45:33. Tags: linux, awk

Question

I have two files. First, I am trying to get the count of each unique value in column 4 of File1.

Then I want to match each unique value against the second column of File2.

Column 4 of File1 and column 2 of File2 contain the values that I need to match between the two files.

So essentially: for each unique value in column 4 of File1, print the value and its count if there is a match in column 2 of File2.

File1

File2

hello "6"
hi "5"

Desired output

total count of hello,6 : 3
total count of hi,5 : 2

My test code

awk 'NR==FNR{a[$4]++}NR!=FNR{gsub(/"/,"",$2);b[$2]=$0}END{for( i in b){printf "Total count of %s,%d : %d\n",gensub(/^([^ ]+).*/,"\1","1",b[i]),i,a[i]}}' File1 File2

I believe I should be able to do this with awk, but for some reason I am really struggling with this one.

Thanks

Recommended answer

Yes, this can be done. Here is a somewhat verbose awk version (it uses GNU awk and its non-POSIX extension gensub):

tink@box ~/tmp$ awk 'NR==FNR{a[$4]++}NR!=FNR{gsub(/"/,"",$2);b[$2]=$0}END{for( i in b){printf "Total count of %s,%d : %d\n",gensub(/^([^ ]+).*/,"\\1","1",b[i]),i,a[i]}}' File1 File2
Total count of hi,5 : 2
Total count of hello,6 : 3
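
If gensub is not available (for example with mawk or a BSD awk), roughly the same result can be sketched with POSIX features only. This is an illustrative variant under stated assumptions, not the answerer's method: it reads File2 first, takes the label from $1 instead of extracting it with gensub, and keeps File2 line order in a hypothetical `order` array, since `for (i in b)` visits keys in an unspecified order. The File1 rows are made-up sample data, because the File1 contents are not shown in the question; only the fourth column matters.

```shell
# Hypothetical sample data: the real File1 is not shown in the question.
dir=$(mktemp -d)
cat > "$dir/File1" <<'EOF'
a b c 6
a b c 5
a b c 6
a b c 6
a b c 5
EOF
cat > "$dir/File2" <<'EOF'
hello "6"
hi "5"
EOF

# POSIX awk only: no gensub, and output follows the line order of File2.
out=$(awk '
    NR == FNR {                 # first file given (File2): remember labels
        gsub(/"/, "", $2)       # strip the quotes around the value
        label[$2] = $1
        order[++n] = $2         # keep the line order of File2
        next
    }
    { count[$4]++ }             # second file given (File1): count column 4
    END {
        for (k = 1; k <= n; k++) {
            v = order[k]
            printf "Total count of %s,%s : %d\n", label[v], v, count[v]
        }
    }
' "$dir/File2" "$dir/File1")
printf '%s\n' "$out"
rm -rf "$dir"
```

Note that the file arguments are swapped relative to the gawk command (File2 first), so that the labels are known before counting starts. A value listed in File2 that never occurs in File1 is printed with a count of 0.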

Some explanation of the gawk command:

NR == FNR {  # while we're on the first file, count all values in column 4
        a[$4]++
}
NR != FNR { # on the second file, strip the quotes from field two, use $2 as
            # index of the array for the second file
        gsub(/"/, "", $2)
        b[$2] = $0
}
# END rule(s)
END { # after both files were processed, pull a match for every line in the 
      # second array, and print it with the count of the occurrences in File1
        for (i in b) {
                printf "Total count of %s,%d : %d\n", gensub(/^([^ ]+).*/, "\\1", "1", b[i]), i, a[i]
        }
}
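
To see what the array `a` holds after the first rule has run, the counting step can be tried on its own. The sketch below assumes a made-up File1 (the original file contents are not shown) in which only column 4 matters; the `counts` variable and the temporary directory are just scaffolding for the example.

```shell
# Hypothetical File1: only column 4 matters here.
dir=$(mktemp -d)
cat > "$dir/File1" <<'EOF'
a b c 6
a b c 5
a b c 6
a b c 6
a b c 5
EOF

# Count each distinct value in column 4, then sort for a stable order,
# because "for (v in a)" visits keys in an unspecified order.
counts=$(awk '{ a[$4]++ } END { for (v in a) print v, a[v] }' "$dir/File1" | sort)
printf '%s\n' "$counts"
rm -rf "$dir"
```

Each output line is a column 4 value followed by how many times it occurred; the END rule in the full script looks up these counts using the unquoted second field of File2 as the key.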
