Binning in NumPy
Question
I have an array A which I am trying to put into 10 bins. Here is what I've done:
import numpy as np

A = range(1, 94)
hist = np.histogram(A, bins=10)
np.digitize(A, hist[1])
But the output has 11 bins, not 10: the last value (93) is placed in bin 11 when it should be in bin 10. I can fix it with a hack, but what is the most elegant way of doing this? How do I tell digitize that the last bin in hist[1] is inclusive on the right, i.e. [ ] instead of [ )?
Recommended answer
The output of np.histogram actually has 10 bins, described by the 11 edges in hist[1]; the last (right-most) bin includes the greatest element because its right edge is inclusive, unlike the other bins, which are half-open.
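To make the mismatch concrete, here is a minimal sketch using the same data as the question. The names counts, edges and raw are chosen for illustration only; they are not part of the original post.

```python
import numpy as np

A = np.arange(1, 94)                      # the values 1..93, as in the question
counts, edges = np.histogram(A, bins=10)  # illustrative names for hist[0], hist[1]

print(len(counts))   # number of bins reported by histogram
print(len(edges))    # number of bin edges (one more than the bins)
print(counts.sum())  # every element of A is counted in some bin

# digitize treats every interval as half-open, so a value equal to the
# last edge falls past the final interval and gets index len(edges).
raw = np.digitize(A, edges)
print(raw.max())     # index assigned to the largest element, 93
```

Only the maximum value lands in the extra index, which is why clamping the result is enough to restore agreement with histogram.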
np.digitize makes no such exception (its purpose is different), so the largest element(s) of the list are placed into an extra bin. To get bin assignments that are consistent with histogram, just clamp the output of digitize to the number of bins, using fmin.
import numpy as np

A = range(1, 94)
bin_count = 10
hist = np.histogram(A, bins=bin_count)
# Clamp the extra index (bin_count + 1) back into the last bin.
np.fmin(np.digitize(A, hist[1]), bin_count)
Output:
array([ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2,
2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,
4, 4, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8,
8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10,
10, 10, 10, 10, 10, 10, 10, 10])
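As a rough sketch, the clamp can be wrapped in a small helper and checked against the counts that histogram itself reports. The function name histogram_bin_ids is hypothetical, and np.minimum is used here in place of fmin; for integer bin indices the two give the same result.

```python
import numpy as np

def histogram_bin_ids(values, bins=10):
    """Hypothetical helper: 1-based bin ids consistent with np.histogram."""
    values = np.asarray(values)
    counts, edges = np.histogram(values, bins=bins)
    ids = np.digitize(values, edges)
    # Values equal to the last edge get index len(edges); fold them
    # back into the last real bin.
    ids = np.minimum(ids, len(counts))
    return ids, counts

ids, counts = histogram_bin_ids(range(1, 94), bins=10)

# Count how many values were assigned to each bin id (1..10) and
# compare with the counts returned by histogram.
per_bin = np.bincount(ids, minlength=len(counts) + 1)[1:]
print(per_bin)
print(np.array_equal(per_bin, counts))
```

Comparing per_bin with counts is a quick sanity check that the clamped assignments agree with histogram's own binning.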