Print lines that contain a value in a specific column shared by more than 1 entity in another col


Problem description

I want to extract only those lines whose Column 2 value is shared by at least 2 unique values in Column 1.

Given this input (in this case, 3 tab-separated columns):

waterline-n    below-sheath-v    14.8097 
dock-n    below-sheath-v     14.5095 
waterline-n    below-steel-n    11.0330 
picnic-n    below-steel-n    12.2277 
wavefront-n    at-part-of-variance-n    18.4888 
wavefront-n    between-part-of-variance-n    17.0656
audience-b    between-part-of-variance-n    17.6346 
game-n    between-part-of-variance-n    14.9652 
whereabouts-n    become-rediscovery-n    11.3556 
whereabouts-n    get-tee-n    10.9091

The desired output is:

waterline-n    below-sheath-v    14.8097 
dock-n    below-sheath-v     14.5095 
waterline-n    below-steel-n    11.0330
picnic-n    below-steel-n    12.2277 
wavefront-n    between-part-of-variance-n    17.0656 
audience-b    between-part-of-variance-n    17.6346 
game-n    between-part-of-variance-n    14.9652

Is it possible to do this using grep?

Recommended answer

Read the file twice with awk, using an array to count how often each Column 2 value occurs.
I think this would be hard to do with grep alone.

awk 'FNR==NR {a[$2]++;next} a[$2]>1' file file
waterline-n    below-sheath-v    14.8097
dock-n    below-sheath-v     14.5095
waterline-n    below-steel-n    11.0330
picnic-n    below-steel-n    12.2277
wavefront-n    between-part-of-variance-n    17.0656
audience-b    between-part-of-variance-n    17.6346
game-n    between-part-of-variance-n    14.9652

In the first pass (FNR==NR is true only while awk is reading the first file argument), each Column 2 value is used as an array key, and its counter is incremented for every line that contains it.
In the second pass, awk looks up each line's Column 2 value in the array and prints the line if its count is greater than one.
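
Note that the one-liner above counts lines, not distinct Column 1 values. For the input in the question the two are equivalent, because no (Column 1, Column 2) pair is repeated. If pairs can repeat, a variant along the following lines might be closer to the stated requirement. This is only a sketch: the temporary file, the sample rows with duplicates, and the array names seen and n are assumptions made for illustration, not part of the original answer.

```shell
# Hypothetical sample data with repeated (column 1, column 2) pairs.
tmp=$(mktemp)
printf '%s\t%s\t%s\n' \
  waterline-n below-sheath-v 14.8097 \
  waterline-n below-sheath-v 14.8097 \
  dock-n      below-sheath-v 14.5095 \
  picnic-n    below-steel-n  12.2277 \
  picnic-n    below-steel-n  12.2277 \
  game-n      get-tee-n      10.9091 > "$tmp"

result=$(awk -F'\t' '
  # Pass 1: count each distinct (column 1, column 2) pair only once.
  FNR == NR { if (!seen[$1, $2]++) n[$2]++; next }
  # Pass 2: print lines whose column 2 value has at least 2 distinct column 1 values.
  n[$2] >= 2
' "$tmp" "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Here only the three below-sheath-v lines are printed. The two picnic-n lines are skipped because below-steel-n occurs with a single Column 1 value, whereas the line-counting version would print them.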
