Better binning in pandas
Question
I've got a data frame and want to filter or bin by a range of values and then get the counts of values in each bin.
Currently, I'm doing this:
x = 5
y = 17
z = 33
filter_values = [x, y, z]
# rows with filtercol <= x
filtered_a = df[df.filtercol <= x]
a_count = filtered_a.filtercol.count()
# rows with x < filtercol <= y
filtered_b = df[df.filtercol > x]
filtered_b = filtered_b[filtered_b.filtercol <= y]
b_count = filtered_b.filtercol.count()
# rows with filtercol > y
filtered_c = df[df.filtercol > y]
c_count = filtered_c.filtercol.count()
But is there a more concise way to accomplish the same thing?
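Recommended answer
Perhaps you are looking for pandas.cut. Cutting filtercol at the edges 0, 5, 17 and 33 and counting the rows that fall into each bin yields, for example,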
(17, 33] 16
(5, 17] 12
(0, 5] 5
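Counts like the ones above could be produced by a sketch along the following lines. The sample frame (the integers 0 to 49) is an assumption, since the data is not shown here; it was picked because it gives the same counts. The names df, filtercol and filter_values come from the question, while out is a hypothetical intermediate name.

```python
import numpy as np
import pandas as pd

# Assumed sample data (integers 0..49); the original frame is not shown.
df = pd.DataFrame({"filtercol": np.arange(50)})

# Bin edges: 0 as a lower bound, then x, y, z from the question.
filter_values = [0, 5, 17, 33]

# pd.cut assigns each value to a right-closed interval such as (0, 5].
# Values outside the edges (here 0 and 34..49) become NaN.
out = pd.cut(df["filtercol"], bins=filter_values)

# value_counts tallies the rows per bin, largest count first; NaN is skipped.
counts = out.value_counts()
print(counts)
```

Because the bins are an ordered categorical, counts is indexed by the intervals themselves, so it can be sorted by bin rather than by frequency.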
To reorder the result so the bin ranges appear in order, you could use
counts.sort_index()
which yields
(0, 5] 5
(5, 17] 12
(17, 33] 16
Thanks to nivniv and InLaw for this improvement.
See also Discretization and quantiling.
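The three filters in the question have no lower or upper bound (filtercol <= x and filtercol > y), whereas the bins above stop at 0 and 33. One possible way to match the question's filters exactly is to use infinite outer edges. This is a sketch under assumptions: the sample frame is hypothetical, and edges and binned are illustrative names.

```python
import numpy as np
import pandas as pd

# Assumed sample data (integers 0..49); the original frame is not shown.
df = pd.DataFrame({"filtercol": np.arange(50)})

x, y = 5, 17

# Open-ended outer edges so the bins mirror the question's three filters:
# filtercol <= x, x < filtercol <= y, filtercol > y.
edges = [-np.inf, x, y, np.inf]
binned = pd.cut(df["filtercol"], bins=edges)

# Count rows per bin and order the result by bin, not by frequency.
counts = binned.value_counts().sort_index()
a_count, b_count, c_count = counts.tolist()
print(counts)
```

With these edges no value falls outside a bin, so the three counts add up to the number of non-null rows in filtercol.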