Using the MATLAB function "pdf"
Problem description
I have a Gaussian mixture distribution object obj of 64 dimensions and would like to pass it to the pdf function to find the probability of a certain point.
Yet when I type pdf(obj,obj.mu(1,:)) to test the object, it yields a very large value (like 2.4845e+069).
That does not make sense, because a probability should lie between zero and one.
Is there something wrong with my MATLAB?
p.s.
Even pdf(obj,obj.mu(1,:)+obj.Sigma(1,1)*rand()) yields a very large value (2.1682e+069).
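First things first: a probability density function is not bounded above by 1; it only has to integrate to 1 over its domain. Its value at a point is a density, not a probability, so it can be arbitrarily large wherever the distribution is tightly concentrated.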
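To make the point about densities concrete, here is a minimal sketch that uses only base MATLAB arithmetic. The value sigma = 0.1 and the isotropic covariance are assumptions chosen for illustration; they are not taken from the question.

```matlab
% Peak density of a 1-D Gaussian at its mean: 1/(sigma*sqrt(2*pi)).
sigma  = 0.1;                          % illustrative standard deviation (assumption)
peak1d = 1 / (sigma * sqrt(2*pi));     % already larger than 1

% For an isotropic d-dimensional Gaussian the peak is the 1-D peak raised
% to the power d, so work in logs to avoid overflow.
d       = 64;
logPeak = -d/2 * log(2*pi*sigma^2);    % log of the density at the mean

fprintf('1-D peak density: %.4f\n', peak1d);
fprintf('log10 of the %d-D peak density: %.1f\n', d, logPeak / log(10));
```

In 64 dimensions even a moderately narrow component produces a huge density at its mean, which is why comparing log-densities is usually more practical than comparing raw pdf values.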
Moreover, what you are seeing is the problem of singularities (see page 434, figure 9.7) that arises when fitting a Gaussian mixture model by maximum likelihood. When a component collapses onto a single data point, its variance goes to 0 and the PDF explodes. This is often encountered with Gaussian mixture models because the log-likelihood is not concave and the likelihood function has many local maxima. We try to find a well-behaved local maximum that performs well, and the singularities are particularly bad cases.
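The effect of a collapsed component can be reproduced with a small hand-built mixture. This is a sketch that assumes the Statistics Toolbox is available; the means, covariances, and mixing proportions below are made up for illustration, and a 2-D model stands in for the 64-D one in the question.

```matlab
% Two-component 2-D mixture; the second covariance is nearly singular.
mu    = [0 0; 3 3];                      % illustrative component means
Sigma = cat(3, eye(2), 1e-8 * eye(2));   % second component has almost collapsed
p     = [0.5 0.5];                       % mixing proportions
obj   = gmdistribution(mu, Sigma, p);

% The density at the mean of the collapsed component is enormous.
peak = pdf(obj, mu(2,:));
fprintf('density at collapsed mean: %g\n', peak);

% A quick diagnostic: the smallest eigenvalue of each component covariance.
for j = 1:size(Sigma, 3)
    fprintf('component %d: min eigenvalue = %g\n', j, min(eig(obj.Sigma(:,:,j))));
end
```

A component whose smallest covariance eigenvalue is many orders of magnitude below the others is a likely sign that the fit has hit one of these singularities.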
When you see this, rerun the algorithm from different starting points or reduce the number of components you are using. The book referenced above also recommends simply resetting the collapsed component to a different value.
Another option is a Bayesian approach: place a prior or a regularization term on the parameters, which penalizes outlandish values such as zero sigma parameters.
You can indirectly control the first part by using different starting values in gmdistribution.fit. For the second part, you can use the Regularize argument: http://www.mathworks.com/help/stats/gmdistribution.fit.html
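Both suggestions can be combined in one call. The sketch below assumes a MATLAB release in which gmdistribution.fit is still available. The data matrix X, the component count k, the number of replicates, and the regularization value 1e-5 are placeholders, not recommendations.

```matlab
% Placeholder data: replace X with your own n-by-64 matrix.
rng(0);                     % fix the random seed so the run is repeatable
X = randn(500, 64);
k = 3;                      % placeholder number of components

% 'Replicates' reruns EM from several random starts and keeps the best fit;
% 'Regularize' adds a small positive value to the diagonal of each covariance.
obj = gmdistribution.fit(X, k, 'Replicates', 5, 'Regularize', 1e-5);

% Check that no component covariance has collapsed.
for j = 1:k
    fprintf('component %d: min eigenvalue = %g\n', j, min(eig(obj.Sigma(:,:,j))));
end
```

If the smallest eigenvalues still sit right at the regularization value, the model is probably still trying to collapse a component, and reducing k may be the better fix.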