Awk average of n data in each column

Question

"Using awk to bin values in a list of numbers" provides a solution for averaging each set of 3 points in a column using awk.

How can it be extended to an indefinite number of columns while maintaining the format? For example:

2457135.564106 13.249116 13.140903 0.003615 0.003440
2457135.564604 13.250833 13.139971 0.003619 0.003438
2457135.565067 13.247932 13.135975 0.003614 0.003432
2457135.565576 13.256441 13.146996 0.003628 0.003449
2457135.566039 13.266003 13.159108 0.003644 0.003469
2457135.566514 13.271724 13.163555 0.003654 0.003476
2457135.567011 13.276248 13.166179 0.003661 0.003480
2457135.567474 13.274198 13.165396 0.003658 0.003479
2457135.567983 13.267855 13.156620 0.003647 0.003465
2457135.568446 13.263761 13.152515 0.003640 0.003458

Averaging the values every 5 lines should output something like:

2457135.564916  13.253240   13.143976   0.003622    0.003444
2457135.567324  13.270918   13.161303   0.003652    0.003472

where the first result is the average of lines 1-5, and the second result is the average of lines 6-10.

Recommended answer

The accepted answer to "Using awk to bin values in a list of numbers" is:

awk '{sum+=$1} NR%3==0 {print sum/3; sum=0}' inFile

The obvious extension to average all the columns is:

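awk 'BEGIN { N = 3 }    # number of rows per block
     # Accumulate a running sum for every column
     { for (i = 1; i <= NF; i++) sum[i] += $i }
     # After every Nth row, print the column averages and reset the sums
     NR % N == 0 { for (i = 1; i <= NF; i++)
                   {
                       printf("%.6f%s", sum[i]/N, (i == NF) ? "\n" : " ")
                       sum[i] = 0
                   }
                 }' inFile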

The extra flexibility here is that if you want to group blocks of 5 rows, you simply change the one occurrence of 3 into 5. The script ignores a trailing partial block of up to N-1 rows at the end of the file. If you want to, you can add an END block that prints a suitable average if NR % N != 0.
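The END block mentioned above could look something like the following. This is only a sketch: the shell wrapper `avg_blocks`, the awk function `print_avg`, and the variable `nf` are illustrative names, not part of the original answer. It reads from standard input and divides the leftover sums by the number of leftover rows (NR % N) rather than by N. The field count is saved in `nf` because not every awk implementation is guaranteed to keep NF available in END.

```shell
avg_blocks() {
    # $1 is the block size (defaults to 3); data is read from standard input
    awk -v N="${1:-3}" '
        function print_avg(count,    i) {
            # Print the average of each column over count rows, then reset
            for (i = 1; i <= nf; i++) {
                printf("%.6f%s", sum[i] / count, (i == nf) ? "\n" : " ")
                sum[i] = 0
            }
        }
        # Accumulate column sums; remember the field count for use in END
        { for (i = 1; i <= NF; i++) sum[i] += $i; nf = NF }
        NR % N == 0 { print_avg(N) }
        # Flush a trailing partial block of NR % N rows, if there is one
        END { if (NR % N != 0) print_avg(NR % N) }
    '
}
```

It would be called as `avg_blocks 5 < inFile`. With N = 3 and the 10-row sample, the tenth row would form a block of its own and be printed as an extra final line.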

For the sample input data, the output I got from the script above was:

2457135.564592 13.249294 13.138950 0.003616 0.003437
2457135.566043 13.264723 13.156553 0.003642 0.003465
2457135.567489 13.272767 13.162732 0.003655 0.003475

You can make the code much more complex if you want it to work out what the output formats should be. I've simply used %.6f to ensure 6 decimal places.
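One possible way to derive the format from the data, instead of hard-coding %.6f, is to track the largest number of decimal places seen in each column and build the printf format from that. This is a rough sketch under stated assumptions: the names `avg_keep_decimals` and `dec` are made up for illustration, the input is assumed to use plain decimal notation (no exponents), and column widths are not preserved.

```shell
avg_keep_decimals() {
    # $1 is the block size (defaults to 3); data is read from standard input
    awk -v N="${1:-3}" '
        {
            for (i = 1; i <= NF; i++) {
                sum[i] += $i
                # Count the digits after the decimal point in this field
                p = index($i, ".")
                d = (p > 0) ? length($i) - p : 0
                if (d > dec[i]) dec[i] = d
            }
        }
        NR % N == 0 {
            for (i = 1; i <= NF; i++) {
                # Build a per-column format such as "%.6f%s"
                fmt = "%." (dec[i] + 0) "f%s"
                printf(fmt, sum[i] / N, (i == NF) ? "\n" : " ")
                sum[i] = 0
            }
        }
    '
}
```

On the sample data every column has 6 decimal places, so `avg_keep_decimals 5 < inFile` should format its output the same way as the %.6f version.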

If you want N to be a command-line parameter, you can use the -v option to relay the variable setting to awk:

awk -v N="${variable:-3}" \
    '{ for (i = 1; i <= NF; i++) sum[i] += $i }
     NR % N == 0 { for (i = 1; i <= NF; i++)
                   {
                       printf("%.6f%s", sum[i]/N, (i == NF) ? "\n" : " ")
                       sum[i] = 0
                   }
                 }' inFile

When invoked with $variable set to 5, the output generated from the sample data is:

2457135.565078 13.254065 13.144591 0.003624 0.003446
2457135.567486 13.270757 13.160853 0.003652 0.003472
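As a quick sanity check, the second field of the first output line can be reproduced by hand from column 2 of sample rows 1-5. The variable name `check` is just for illustration.

```shell
check=$(awk 'BEGIN {
    # Column 2 of rows 1-5 from the sample input, averaged over 5 rows
    printf("%.6f\n", (13.249116 + 13.250833 + 13.247932 + 13.256441 + 13.266003) / 5)
}')
echo "$check"    # prints 13.254065
```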
