Calculating average without considering missing values in shell script?

Problem description

I have a dataset in which many missing values are recorded as -999. Part of the data is:

input.txt
30
-999
10
40
23
44
-999
-999
31
-999
54
-999 
-999
-999
-999
-999
-999
-999 and so on

I would like to calculate the average over each 6-row interval without considering the missing values.

The desired output is:

ofile.txt
29.4
42.5
-999

So far I am trying:

awk '!/\-999/{sum += $1; count++} NR%6==0{print count ? (sum/count) : count;sum=count=0}' input.txt

It gives:

29.4
42.5
0

Recommended answer

I'm not entirely sure why, if you're discounting -999 values, you'd think that -999 is a better choice than zero for the average of the third group. In the first two groups, the -999 values contribute to neither the sum nor the count, so an argument could be made that zero is the better choice.
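To see what each group actually accumulates, here is a small sketch that prints the sum and count per 6-row group for the sample data from the question. The function name group_stats is only illustrative, and the sketch compares the first field numerically against -999 rather than using the regex from the question.

```shell
# Illustrative helper (the name is hypothetical): show sum and count per 6-row group.
group_stats() {
  awk '
    $1 != -999  { sum += $1; count++ }   # skip rows whose first field is -999
    NR % 6 == 0 { printf "group %d: sum=%d count=%d\n", NR / 6, sum, count
                  sum = count = 0 }      # reset for the next group
  '
}

# Sample data from the question
printf '%s\n' 30 -999 10 40 23 44 -999 -999 31 -999 54 -999 \
              -999 -999 -999 -999 -999 -999 | group_stats
```

The third group ends with a count of zero, so there is nothing to divide by and the script has to pick some placeholder for it.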

However, it may be that you want -999 to represent a "lack of value" (which would certainly be the case where there were no values in a group). If that's the case, you can just output -999 rather than count in your original code:

awk '!/\-999/{sm+=$1;ct++} NR%6==0{print ct?(sm/ct):-999;sm=ct=0}' input.txt

Even if you decide that zero is a better answer, I'd still make that explicit rather than outputting the count variable itself:

awk '!/\-999/{sm+=$1;ct++} NR%6==0{print ct?(sm/ct):0;sm=ct=0}' input.txt
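One caveat: !/\-999/ is a regular-expression match, so it also skips any line that merely contains -999 (for example -9990). Below is a minimal sketch of a stricter variant. It assumes the value sits in the first field and that a trailing group shorter than the interval should still be averaged; neither is stated in the question. The names avg_groups, n and missing are illustrative only.

```shell
# Hypothetical helper: avg_groups [interval] [missing_marker], reads values on stdin.
avg_groups() {
  awk -v n="${1:-6}" -v missing="${2:--999}" '
    NF && $1 != missing { sum += $1; count++ }   # numeric comparison on field 1
    NR % n == 0 { print (count ? sum / count : missing); sum = count = 0 }
    # Assumption: also report a final group that has fewer than n rows
    END { if (NR % n) print (count ? sum / count : missing) }
  '
}

# Sample data from the question
printf '%s\n' 30 -999 10 40 23 44 -999 -999 31 -999 54 -999 \
              -999 -999 -999 -999 -999 -999 | avg_groups
```

With a real file this would be called as avg_groups < input.txt > ofile.txt, or as avg_groups 6 0 < input.txt if zero is the preferred placeholder for an empty group.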
