awk: average part of a column when lines match on a specific field
Problem description
Here is a sample of my input file:
$ cat NDVI-bm
P01 031.RAW 0.516 0 0
P01 021.RAW 0.449 0 0
P02 045.RAW 0.418 0 0
P03 062.RAW 0.570 0 0
P03 064.RAW 0.469 0 0
P04 083.RAW 0.636 0 0
P04 081.RAW 0.592 0 0
P04 082.RAW 0.605 0 0
P04 084.RAW 0.648 0 0
P05 093.RAW 0.748 0 0
I need to average column 3 over the lines whose first field matches. Simple enough, but I'm struggling because my awk knowledge is quite basic. Here is what I have so far:
awk '{array[$1]+=$3(need to divide here by number of matches...)} END { for (i in array) {print i"," array[i]}}' NDVI-bm
From searching the web, I'm really not sure I'm heading in the right direction, unless there is an easy way to count the number of matches, which I can't seem to find. Any ideas?
Thanks for your help!
Recommended answer
For example, to calculate the average of column 3 over the lines starting with "P01":
/^P01/ {          # only lines that start with "P01"
    sum += $3     # running total of column 3
    num += 1      # number of matching lines
}
END { print "avg = " sum/num }
Output:
$ awk -f avg.awk input
avg = 0.4825
...or, as a one-liner:
$ awk '/^P01/{sum+=$3; num+=1} END{print "avg="sum/num}' input
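Two details of the one-liner are worth a hedged note. The pattern `/^P01/` is a prefix match, so it would also pick up a key such as `P010`, and if no line matches at all, the division in `END` divides by zero, which many awk implementations report as an error. The sketch below is one possible variant, not part of the original answer: `avg_for` is a hypothetical helper name, the key is passed in with `-v`, and the three sample rows are copied from the question.

```shell
# Sample rows copied from the question, written to a throwaway temp file.
data=$(mktemp)
cat > "$data" <<'EOF'
P01 031.RAW 0.516 0 0
P01 021.RAW 0.449 0 0
P02 045.RAW 0.418 0 0
EOF

# avg_for KEY FILE (hypothetical helper): mean of column 3 over the rows
# whose first field is exactly KEY.
avg_for() {
    awk -v key="$1" '
        $1 == key { sum += $3; n++ }      # exact comparison, not a prefix match
        END {
            if (n > 0) print "avg=" sum / n
            else       print "no matching lines"   # avoids dividing by zero
        }' "$2"
}

avg_hit=$(avg_for P01 "$data")
avg_miss=$(avg_for P99 "$data")
echo "$avg_hit"     # avg=0.4825
echo "$avg_miss"    # no matching lines
rm -f "$data"
```

Comparing `$1 == key` keeps the match tied to the first field, which is what the question asks for.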
Or, to do the calculation for every value of the first column in a single pass:
{
    sum[$1] += $3    # per-key total of column 3
    cnt[$1]++        # per-key number of lines
}
END {
    print "Name" "\t" "sum" "\t" "cnt" "\t" "avg"
    for (i in sum)
        print i "\t" sum[i] "\t" cnt[i] "\t" sum[i]/cnt[i]
}
Output:
$ awk -f avg.awk input
Name sum cnt avg
P01 0.965 2 0.4825
P02 0.418 1 0.418
P03 1.039 2 0.5195
P04 2.481 4 0.62025
P05 0.748 1 0.748
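One caveat: awk does not define the order in which `for (i in sum)` visits the keys, so the rows may not come out sorted as they do in the listing above. The following is a minimal sketch, assuming the `key,average` format from the question's own attempt is what is wanted, that pipes the result through `sort` to make the order deterministic. The temp file and the variable name `result` are illustrative only.

```shell
# Full sample input from the question, written to a throwaway temp file.
data=$(mktemp)
cat > "$data" <<'EOF'
P01 031.RAW 0.516 0 0
P01 021.RAW 0.449 0 0
P02 045.RAW 0.418 0 0
P03 062.RAW 0.570 0 0
P03 064.RAW 0.469 0 0
P04 083.RAW 0.636 0 0
P04 081.RAW 0.592 0 0
P04 082.RAW 0.605 0 0
P04 084.RAW 0.648 0 0
P05 093.RAW 0.748 0 0
EOF

# Accumulate a per-key sum and count, print "key,mean" in END,
# then sort because the for-in traversal order is unspecified.
result=$(awk '{ sum[$1] += $3; cnt[$1]++ }
              END { for (k in sum) print k "," sum[k] / cnt[k] }' "$data" | sort)
printf '%s\n' "$result"
rm -f "$data"
```

With the sample data this prints one line per key, from `P01,0.4825` through `P05,0.748`.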