awk: keep records with the highest value that share a field, while ignoring other fields
Question
Imagine that you want to keep the records with the highest value in a given field of a table, just comparing within the categories defined by another field (and ignoring the contents of the others).
So, given the input nye.txt:
X A 10.00
X A 1.50
X B 0.01
X B 4.00
Y C 1.00
Y C 2.43
You would expect this output:
X A 10.00
Y C 2.43
This is an offshoot of this previous, related thread: awk: keep records with the highest value, comparing those that share other fields (http://stackoverflow.com/questions/29239080/awk-keep-records-with-the-highest-value-comparing-those-that-share-other-field)
I already have a solution (see below), but ideas are welcome!
Recommended answer
Something like this in awk:
awk '$3>=a[$1]{a[$1]=$3; b[$1]=$0} END{for(i in a)print b[i]}' File
For each value of the 1st column (X, Y, etc.), if the 3rd column value is greater than or equal to the largest value stored so far for that key (i.e. a[$1], which compares as 0 by default while it is still unset), update a[$1] with this 3rd column value and save the entire line in array b under the same key. In the END block, print the saved lines.
Output:
AMD$ awk '$3>=a[$1]{a[$1]=$3; b[$1]=$0} END{for(i in a)print b[i]}' File
X A 10.00
Y C 2.43
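Since the question invites other ideas, here is an alternative sketch that is not from the answer above: sort the file by the 1st column and then by the 3rd column in descending numeric order, and let awk keep only the first line of each group. The function name max_by_sort is hypothetical, and LC_ALL=C is set on the assumption that the values use a dot as the decimal separator.

```shell
# Hypothetical helper: sort so that the largest column-3 value comes
# first within each column-1 group, then print one line per group.
max_by_sort() {
    LC_ALL=C sort -k1,1 -k3,3nr "$@" |
        awk '!seen[$1]++'   # true only the first time a key appears
}

# Recreate the sample input from the question and run the helper.
printf '%s\n' 'X A 10.00' 'X A 1.50' 'X B 0.01' \
              'X B 4.00' 'Y C 1.00' 'Y C 2.43' > nye.txt
max_by_sort nye.txt
```

This version handles negative values without special cases and prints the groups in sorted key order, at the cost of sorting the whole file first.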