Bash: Keep all lines with duplicate values in column X

Views: 79 | Published: 2020/11/12 21:23:30 | Tags: bash, awk

Problem description

I have a file with a few thousand lines and 20+ columns. I want to keep only the lines whose e-mail address in column 3 also appears in column 3 of at least one other line.

File (First Name; Last Name; E-Mail; ...):

Mike;Tyson;mike@tyson.com
Tom;Boyden;tom@boyden.com
Tom;Cruise;mike@tyson.com
Mike;Myers;mike@tyson.com
Jennifer;Lopez;jennifer@lopez.com
Andre;Agassi;tom@boyden.com
Paul;Walker;paul@walker.com

I want to keep ALL lines that have a matching e-mail address. In this case the expected output would be

Mike;Tyson;mike@tyson.com
Tom;Boyden;tom@boyden.com
Tom;Cruise;mike@tyson.com
Mike;Myers;mike@tyson.com
Andre;Agassi;tom@boyden.com

If I use

awk -F';' 'seen[$3]++' file

I lose the first instance of each e-mail address (lines 1 and 2 in this case) and keep ONLY the later duplicates.

Is there a way to keep all of these lines?
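For reference, the two common `seen` idioms behave as opposites, and neither keeps a complete group of duplicates. The sketch below runs both on the sample data from the question; the variable names `data`, `first_only`, and `repeats_only` are illustrative assumptions, not part of the original post.

```shell
# Sample rows from the question, held in a shell variable for illustration.
data='Mike;Tyson;mike@tyson.com
Tom;Boyden;tom@boyden.com
Tom;Cruise;mike@tyson.com
Mike;Myers;mike@tyson.com
Jennifer;Lopez;jennifer@lopez.com
Andre;Agassi;tom@boyden.com
Paul;Walker;paul@walker.com'

# !seen[$3]++ is true only the first time a column-3 value appears,
# so it prints the first line per e-mail and drops later repeats.
first_only=$(printf '%s\n' "$data" | awk -F';' '!seen[$3]++')

# seen[$3]++ is true from the second appearance onward,
# so it prints only the later repeats and drops each first occurrence.
repeats_only=$(printf '%s\n' "$data" | awk -F';' 'seen[$3]++')

printf 'first occurrences:\n%s\n\nlater repeats:\n%s\n' "$first_only" "$repeats_only"
```

The desired result is every line of any group that has more than one member, which requires knowing the group size before deciding whether to print its first line.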

Recommended answer

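If the output order doesn't matter, here's a one-pass approach. It holds back the first line seen for each e-mail address and prints it only once a second line with the same address turns up: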

$ awk -F';' '$3 in first{print first[$3] $0; first[$3]=""; next} {first[$3]=$0 ORS}' file
Mike;Tyson;mike@tyson.com
Tom;Cruise;mike@tyson.com
Mike;Myers;mike@tyson.com
Tom;Boyden;tom@boyden.com
Andre;Agassi;tom@boyden.com
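If the original line order does matter, one possible alternative (not from the answer above) is the common two-pass awk idiom: read the file twice, counting column-3 values on the first pass and printing on the second. A minimal sketch, assuming the file is a regular file that can be read twice; the temporary file and the names `file`, `count`, and `result` are illustrative assumptions.

```shell
# Write the sample rows from the question to a temporary file (illustrative only).
file=$(mktemp)
cat > "$file" <<'EOF'
Mike;Tyson;mike@tyson.com
Tom;Boyden;tom@boyden.com
Tom;Cruise;mike@tyson.com
Mike;Myers;mike@tyson.com
Jennifer;Lopez;jennifer@lopez.com
Andre;Agassi;tom@boyden.com
Paul;Walker;paul@walker.com
EOF

# Pass 1 (NR==FNR holds only while reading the first file argument):
#   count how many times each column-3 value occurs, then skip to the next line.
# Pass 2 (same file given a second time):
#   print every line whose column-3 value occurred more than once.
result=$(awk -F';' 'NR==FNR { count[$3]++; next } count[$3] > 1' "$file" "$file")

printf '%s\n' "$result"
rm -f "$file"
```

On the sample data this prints the five matching lines in their input order. The trade-off is that the file is read twice, so this form does not work on a pipe or other non-seekable input.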
