Entropy estimation using a histogram of normal data vs. the direct formula (MATLAB)
Question
Let's assume we have drawn n = 10000 samples from the standard normal distribution.
Now I want to estimate its entropy, using a histogram to compute the probabilities.
1) Calculate the probabilities (for example, using MATLAB):
[p,x] = hist(samples,binnumbers);
area = (x(2)-x(1))*sum(p);
p = p/area;
(binnumbers is determined by some rule)
2) Estimate the entropy:
H = -sum(p.*log2(p))
which gives 58.6488.
Now, when I use the direct formula to calculate the entropy of normal data, I get:
H = 0.5*log2(2*pi*exp(1)) = 2.0471
What am I doing wrong when using the histogram + entropy formula? Thank you very much for any help!
Recommended answer
You are missing the dp term (the bin width) in the sum:
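dp = (x(2)-x(1));               % bin width
area = sum(p)*dp;               % total area under the raw histogram
p = p/area;                     % normalize so the histogram integrates to 1 (as in the question)
H = -sum( (p*dp) .* log2(p) );  % element-wise product; each term is weighted by dp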
This should bring you close enough...
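To see why the dp factor matters, it may help to write the estimate as a Riemann sum of the differential entropy integral. This is a sketch in notation that is not in the original post: f_i stands for the normalized histogram height in bin i (p after dividing by area) and \Delta stands for the bin width dp.

```latex
H = -\int f(x)\,\log_2 f(x)\,dx
  \;\approx\; -\sum_i f_i \,\log_2(f_i)\,\Delta ,
\qquad
\sum_i f_i\,\Delta = 1 .

% Dropping \Delta leaves -\sum_i f_i \log_2 f_i \approx H/\Delta,
% which is much larger than H when the bins are narrow.

% Closed form for a normal distribution with variance \sigma^2:
H = \tfrac{1}{2}\log_2\!\left(2\pi e\,\sigma^2\right)
  \approx 2.0471 \text{ bits for } \sigma = 1 .
```

Under this reading, the sum in the question is off by a factor of roughly 1/dp, which is consistent with it coming out far above 2.0471.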
PS: be careful when you take log2(p), because you might sometimes have empty bins. You might find nansum useful.
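Putting the pieces together, the whole estimate could look something like the following. This is only a sketch under stated assumptions: the seed, the bin count of 50, and the variable names counts, centers, mask, H_hist and H_exact are illustrative choices and not from the original post. Instead of nansum, it masks out empty bins before taking the logarithm, which has the same effect here because an empty bin contributes nothing to the sum.

```matlab
% Sketch: histogram-based differential entropy of standard normal samples.
rng(0);                           % fixed seed for repeatability (assumption)
n = 10000;
samples = randn(n, 1);
binnumbers = 50;                  % assumed bin count; the question leaves the rule unspecified

[counts, centers] = hist(samples, binnumbers);
dp = centers(2) - centers(1);     % bin width (hist uses equally spaced bins)
p = counts / (sum(counts) * dp);  % density estimate that integrates to 1

mask = p > 0;                     % skip empty bins, which would give 0*log2(0) = NaN
H_hist = -sum(p(mask) .* log2(p(mask))) * dp;   % Riemann sum including dp
H_exact = 0.5 * log2(2*pi*exp(1));              % closed form for sigma = 1

fprintf('histogram: %.4f bits, closed form: %.4f bits\n', H_hist, H_exact);
```

The two printed values should be close but not identical, since the histogram estimate depends on the random sample and on the number of bins.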