Binning in Numpy


Question

I have an array A which I am trying to put into 10 bins. Here is what I've done.

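import numpy as np

A = range(1, 94)
hist = np.histogram(A, bins=10)  # hist[0]: counts, hist[1]: the 11 bin edges
np.digitize(A, hist[1])          # 1-based bin index for each element of A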

But the output has 11 bins, not 10: the last value (93) is placed in bin 11 when it should be in bin 10. I can fix it with a hack, but what is the most elegant way to do this? How do I tell digitize that the last bin in hist[1] is inclusive on the right, i.e. [ ] instead of [ )?

Recommended answer

The output of np.histogram actually has 10 bins; the last (right-most) bin includes the greatest element because its right edge is inclusive (unlike the other bins, which are half-open).
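As a quick check of the claim above, the following sketch unpacks the histogram result and inspects it. The names counts and edges are chosen here for illustration and are not part of the original question.

```python
import numpy as np

A = np.arange(1, 94)                      # same values as range(1, 94)
counts, edges = np.histogram(A, bins=10)  # names are illustrative

print(len(counts))    # 10 bins
print(len(edges))     # 11 edges, one more than the number of bins
print(counts.sum())   # 93: every element, including the maximum, is counted
print(edges[-1])      # 93.0: the right edge of the last bin equals the maximum
```

Since all 93 elements are counted in 10 bins and the last edge equals the maximum, the value 93 must sit inside the last bin, which is only possible if that bin is closed on the right.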

np.digitize makes no such exception (its purpose is different), so the largest element(s) of the list are placed into an extra bin. To get bin assignments that are consistent with histogram, clamp the output of digitize to the number of bins using np.fmin.

import numpy as np

A = range(1, 94)
bin_count = 10
hist = np.histogram(A, bins=bin_count)
# Clamp the indices so the maximum lands in bin 10 instead of bin 11.
np.fmin(np.digitize(A, hist[1]), bin_count)

Output:

array([ 1,  1,  1,  1,  1,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2,  2,  2,
        2,  2,  3,  3,  3,  3,  3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  4,
        4,  4,  4,  5,  5,  5,  5,  5,  5,  5,  5,  5,  6,  6,  6,  6,  6,
        6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  7,  7,  7,  7,  8,  8,  8,
        8,  8,  8,  8,  8,  8,  9,  9,  9,  9,  9,  9,  9,  9,  9, 10, 10,
       10, 10, 10, 10, 10, 10, 10, 10])
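One way to confirm that the clamped indices agree with histogram is to count the elements per bin and compare against the histogram counts. This is a minimal sketch; the names raw, clamped, and per_bin are introduced here for illustration only.

```python
import numpy as np

A = np.arange(1, 94)
bin_count = 10
counts, edges = np.histogram(A, bins=bin_count)

raw = np.digitize(A, edges)        # 1-based indices; the maximum falls in bin_count + 1
clamped = np.fmin(raw, bin_count)  # fold that extra bin back into the last one

# Count the elements in bins 1..bin_count (index 0 is unused by digitize here).
per_bin = np.bincount(clamped, minlength=bin_count + 1)[1:]

print(raw.max())                        # 11
print(clamped.max())                    # 10
print(np.array_equal(per_bin, counts))  # compares against the histogram counts
```

Because the indices are integers, np.minimum would give the same result as np.fmin here; the two differ only in how they treat NaN values.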
