Pandas groupby: how to compute counts in ranges


Question

Say I have a huge list of numbers between 0 and 100. I compute the ranges from the maximum number and a fixed number of bins, say 10. So my ranges are, for example:

ranges = [0,10,20,30,40,50,60,70,80,90,100]

Now I count the occurrences in each range: 0-10, 10-20, and so on. To do this I iterate over every number in the list and check which range it falls into. I assume this is not the best approach in terms of runtime speed.

Can I speed this up using pandas, e.g. pandas.groupby, and if so, how?

Recommended answer

We can use pd.cut to bin the values into ranges, then group by those ranges, and finally call count to count the values that fall into each bin:

In [82]:

import numpy as np
import pandas as pd
df = pd.DataFrame({"a": np.random.randint(0, 101, size=100)})  # upper bound is exclusive, so values are 0..100
ranges = [0,10,20,30,40,50,60,70,80,90,100]
df.groupby(pd.cut(df.a, ranges)).count()
Out[82]:
            a
a            
(0, 10]    10
(10, 20]    6
(20, 30]   12
(30, 40]    9
(40, 50]   11
(50, 60]   12
(60, 70]    9
(70, 80]   13
(80, 90]    9
(90, 100]   9
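
One detail worth noting about the output above: by default pd.cut builds intervals that are open on the left, so a value of exactly 0 falls outside (0, 10] and is not counted. The following is a minimal sketch of the same binning idea with that edge handled, assuming include_lowest=True is the behavior you want. The sample values are hypothetical and were picked only so the counts can be checked by hand; observed=False is passed explicitly so that empty bins still appear in the result.

```python
import pandas as pd

# Hypothetical sample data, small enough to verify the counts by hand.
values = pd.Series([0, 5, 10, 11, 20, 55, 99, 100])
ranges = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# include_lowest=True closes the first interval on the left,
# so a value of exactly 0 lands in the first bin instead of becoming NaN.
binned = pd.cut(values, ranges, include_lowest=True)

# Group the values by their bin and count them.
# observed=False keeps bins that contain no values (count 0).
counts = values.groupby(binned, observed=False).count()
print(counts)
```

Here the first bin holds 0, 5 and 10, the second holds 11 and 20, and the last holds 99 and 100, since each interval is closed on the right.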
