Bash: Keep all lines with duplicate values in column X


Problem description

I have a file with a few thousand lines and 20+ columns. I want to keep only the lines whose e-mail address in column 3 also appears in at least one other line.

File (First Name; Last Name; E-Mail; ...):

Mike;Tyson;mike@tyson.com
Tom;Boyden;tom@boyden.com
Tom;Cruise;mike@tyson.com
Mike;Myers;mike@tyson.com
Jennifer;Lopez;jennifer@lopez.com
Andre;Agassi;tom@boyden.com
Paul;Walker;paul@walker.com

I want to keep ALL lines that have a matching e-mail address. In this case the expected output would be:

Mike;Tyson;mike@tyson.com
Tom;Boyden;tom@boyden.com
Tom;Cruise;mike@tyson.com
Mike;Myers;mike@tyson.com
Andre;Agassi;tom@boyden.com

If I use

awk -F';' 'seen[$3]++' file

I lose the first instance of each e-mail address (lines 1 and 2 in this case) and keep ONLY the later duplicates.

Is it possible to keep all of these lines?

Recommended answer

If the output order doesn't matter, here's a one-pass approach:

$ awk -F';' '$3 in first{print first[$3] $0; first[$3]=""; next} {first[$3]=$0 ORS}' file
Mike;Tyson;mike@tyson.com
Tom;Cruise;mike@tyson.com
Mike;Myers;mike@tyson.com
Tom;Boyden;tom@boyden.com
Andre;Agassi;tom@boyden.com
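
If the original line order does matter, one common alternative is to read the file twice: count each column-3 value on the first pass, then print the lines whose value was counted more than once on the second. The following is only a sketch of that idea and is not part of the original answer; the function name keep_dups and the temporary file in sample are made up for illustration.

```shell
# Hypothetical helper (not from the original answer): print every line whose
# column-3 value occurs more than once, preserving the input order.
keep_dups() {
    # Pass 1 (NR==FNR holds only while reading the first copy of the file):
    #   count how often each column-3 value occurs.
    # Pass 2: print the lines whose column-3 value was seen more than once.
    awk -F';' 'NR==FNR { count[$3]++; next } count[$3] > 1' "$1" "$1"
}

# Sample data from the question, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
Mike;Tyson;mike@tyson.com
Tom;Boyden;tom@boyden.com
Tom;Cruise;mike@tyson.com
Mike;Myers;mike@tyson.com
Jennifer;Lopez;jennifer@lopez.com
Andre;Agassi;tom@boyden.com
Paul;Walker;paul@walker.com
EOF

keep_dups "$sample"
```

Because the file name is passed to awk twice, this needs a regular file rather than a pipe. For the sample data it should print the five lines from the expected output in the question, in their original order.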
