Create histogram-like bins with awk

Question

This is my input file:

1.37987
1.21448
0.624999
1.28966
1.77084
1.088
1.41667

I would like to create bins of a size of my choice to get histogram-like output, e.g. something like this for bins of width 0.1, starting from 0:

0 0.1 0
...
0.5 0.6 0
0.6 0.7 1
...
1.0 1.1 1
1.1 1.2 0
1.2 1.3 2
1.3 1.4 1
...

My file is too big for R, so I'm looking for an awk solution (I'm also open to anything else that I can understand, as I'm still a Linux beginner).

This was sort of already answered in this post: awk histogram in buckets, but that solution is not working for me.

Recommended answer

This is also possible:

awk -v size=0.1 '
  # b is the bucket index; bmin and bmax start at 0 because they are uninitialised
  { b=int($1/size); a[b]++; bmax=b>bmax?b:bmax; bmin=b<bmin?b:bmin }
  END { for(i=bmin;i<=bmax;++i) print i*size,(i+1)*size,a[i]+0 }' file

It essentially does the same as EdMorton's solution, but starts printing buckets from the minimum bucket, which defaults to 0 because bmin is never initialised. A negative value lowers bmin, so negative numbers are taken into account as well.
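For illustration only, here is a rough sketch of the same idea wrapped in a shell function and run on the sample input from the question. The function name `histogram` and the array name `count` are hypothetical, not part of the answer. Two details are my own assumptions rather than the answer's method: empty buckets are printed as 0 via `count[i] + 0`, and the bucket index is stepped down by one for negative values, since awk's `int()` truncates toward zero and would otherwise put values just below and just above zero into the same bucket.

```shell
# Hypothetical helper: histogram SIZE [FILE...]
# Counts how many values (first field of each line) fall into buckets of width SIZE.
histogram() {
  hist_size=$1
  shift
  awk -v size="$hist_size" '
    BEGIN { bmin = 0; bmax = 0 }   # the printed range always includes bucket 0
    {
      q = $1 / size
      b = int(q)                   # int() truncates toward zero
      if (b > q) b--               # so step down to get a true floor for negatives
      count[b]++
      if (b < bmin) bmin = b
      if (b > bmax) bmax = b
    }
    END {
      # "+ 0" forces empty buckets to print as 0 instead of an empty string
      for (i = bmin; i <= bmax; i++)
        print i * size, (i + 1) * size, count[i] + 0
    }' "$@"
}

# Sample values from the question, binned with a width of 0.1
printf '%s\n' 1.37987 1.21448 0.624999 1.28966 1.77084 1.088 1.41667 | histogram 0.1
```

With the sample data this should print one line per bucket from 0 up to the bucket holding 1.77084, including lines such as `0.6 0.7 1` and `1.2 1.3 2`. Bucket edges that are whole numbers are printed without a decimal point (`1 1.1 1` rather than `1.0 1.1 1`), because awk formats integral values as integers. Since awk reads the input one line at a time, memory use depends on the number of buckets, not on the size of the file.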
