numpy histogram cumulative density does not sum to 1

Question

Taking a tip from another thread (@EnricoGiampieri's answer to "cumulative distribution plots python"), I wrote:

# plot cumulative density function of nearest nbr distances
# evaluate the histogram
values, base = np.histogram(nearest, bins=20, density=1)
#evaluate the cumulative
cumulative = np.cumsum(values)
# plot the cumulative function
plt.plot(base[:-1], cumulative, label='data')

I put in the density=1 from the documentation on np.histogram, which says:

"Note that the sum of the histogram values will not be equal to 1 unless bins of unity width are chosen; it is not a probability mass function."

Well, indeed, when plotted, they don't sum to 1. But, I do not understand the "bins of unity width." When I set the bins to 1, of course, I get an empty chart; when I set them to the population size, I don't get a sum to 1 (more like 0.2). When I use the 40 bins suggested, they sum to about .006.

Can anybody give me some guidance? Thanks!

Recommended answer

You need to make sure your bins are all width 1. That is:

np.all(np.diff(base)==1)

To achieve this, you have to manually specify your bins:

# np.arange excludes its stop value, so add 1 to keep ceil(max) as the last edge
bins = np.arange(np.floor(nearest.min()), np.ceil(nearest.max()) + 1)
values, base = np.histogram(nearest, bins=bins, density=True)

You will then get:

In [18]: np.all(np.diff(base)==1)
Out[18]: True

In [19]: np.sum(values)
Out[19]: 0.99999999999999989
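
As a complement to the unit-width approach, here is a minimal sketch of another way to get a cumulative curve that ends at 1 while keeping arbitrary bins: weight each density value by its bin width before accumulating. With density enabled, np.histogram normalizes the area under the bars, so it is the sum of values times widths that equals 1, not the sum of the values themselves. The `nearest` array below is hypothetical stand-in data, not the asker's distances, and the names `widths`, `raw_sum`, and `area` are introduced only for illustration.

```python
import numpy as np

# Hypothetical stand-in for the asker's `nearest` distances.
rng = np.random.default_rng(0)
nearest = rng.uniform(0.0, 5.0, size=1000)

# Any binning works here; the bins do not need unit width.
values, base = np.histogram(nearest, bins=20, density=True)
widths = np.diff(base)

# density=True normalizes the area, not the sum of the bar heights.
raw_sum = values.sum()           # depends on the bin width
area = np.sum(values * widths)   # 1 up to floating-point rounding

# Weight each density value by its bin width before accumulating.
cumulative = np.cumsum(values * widths)
```

Each entry of `cumulative` is the fraction of samples at or below the right edge of its bin, so plotting it against `base[1:]` rather than `base[:-1]` lines the curve up with those edges.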
