Print whole lines when a duplicate is found

Views: 53  Published: 2021/5/9 20:53:05  Tags: awk data-processing

Problem description

Here is a snippet of my input:

DGD3 SOL10
DGD53 SOL15
DGD100 SOL15
DGD92 SOL20
DGD41 SOL22
DGD62 SOL35
DGD13 SOL40
DGD13 SOL40

My expected output:

DGD53 SOL15
DGD100 SOL15
DGD13 SOL40
DGD13 SOL40

Some SOL values in my data appear twice (never more than twice; for example, no SOL appears three times in a file). SOL is in my second column ($2). I need a program that prints the whole line (DGD and SOL) whenever the SOL value in $2 is duplicated. Could you help me?

Recommended answer

Another awk solution. It makes a single pass over the file, does not require the file to be sorted, and still works correctly if a value in the second field occurs more than twice. In the worst case (no duplicates at all) it stores every line of the file in memory and produces no output:

$ awk '{
    if(!c[$2]++)           # if first instance of $2
        a[$2]=$0           # store it
    else {
        if(c[$2]==2) {     # if second instance 
            print a[$2]    # print previous
            delete a[$2]   # free the stored line; it is no longer needed
        } 
        print              # after first instance of $2 we always print current
    }
}' file

Output:

DGD53 SOL15
DGD100 SOL15
DGD13 SOL40
DGD13 SOL40
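
For comparison, here is a sketch of a different approach that is not part of the answer above: read the file twice, counting the values of $2 on the first pass and printing on the second. It keeps only one counter per distinct value in memory rather than whole lines, but it assumes the input is a regular file that can be read twice (so not a pipe) and that the file is not empty. The names input, result and count are illustrative only.

```shell
# Write the sample input from the question to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
DGD3 SOL10
DGD53 SOL15
DGD100 SOL15
DGD92 SOL20
DGD41 SOL22
DGD62 SOL35
DGD13 SOL40
DGD13 SOL40
EOF

# The same file is passed twice.
# First pass (NR==FNR): count how often each value of $2 occurs.
# Second pass: print every line whose $2 was counted more than once.
result=$(awk 'NR==FNR { count[$2]++; next } count[$2] > 1' "$input" "$input")
printf '%s\n' "$result"

rm -f "$input"
```

On the sample input this should print the same four lines as the single-pass version, in their original order. The trade-off is two reads of the file in exchange for a smaller memory footprint.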
