Using the MATLAB function "pdf"

Problem description

I have a Gaussian mixture distribution object obj with 64 dimensions and would like to pass it to the pdf function to find the probability of a certain point.

Yet when I type pdf(obj,obj.mu(1,:)) to test the object, it yields a very high probability (like 2.4845e+069).

That does not make sense, because a probability should lie between zero and one.

Is something wrong with my MATLAB?

P.S. Even pdf(obj,obj.mu(1,:)+obj.Sigma(1,1)*rand()) yields a high probability (2.1682e+069).

Solution

First things first: a probability density function does not return a probability. Its value at a point is not bounded above by 1; it merely integrates to 1 over its domain.
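As a rough illustration of that point, the sketch below uses only base MATLAB. The standard deviation 0.01 and the isotropic covariance are assumptions chosen for the example, not values taken from the question. It shows a 1-D Gaussian density whose peak is far above 1 while its integral stays at 1, and how the same peak scales in 64 dimensions.

```matlab
% Peak height of a 1-D Gaussian density with a small standard deviation.
sigma = 0.01;                                      % assumed value, for illustration only
gauss = @(x) exp(-x.^2 ./ (2*sigma^2)) ./ (sigma * sqrt(2*pi));
peak1 = gauss(0);                                  % density at the mean, well above 1
area1 = integral(gauss, -10*sigma, 10*sigma);      % area under the curve, still about 1

% Same idea in d dimensions with covariance sigma^2 * I: the peak is
% (2*pi)^(-d/2) * sigma^(-d), so we work with its base-10 logarithm.
d = 64;
log10PeakD = -(d/2) * log10(2*pi) - d * log10(sigma);

fprintf('1-D peak = %.4g, area = %.6f, log10 of %d-D peak = %.1f\n', ...
        peak1, area1, d, log10PeakD);
```

Working with the logarithm of the density avoids overflow when the dimension is large.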

Moreover, what you are seeing is the problem of singularities (see page 434, figure 9.7) that arises when fitting a Gaussian mixture model. When a component collapses onto a single data point, its variance goes to 0 and the PDF blows up at that point. This is common in Gaussian mixture models because the likelihood is not log-concave and has many local maxima. We want a well-behaved local maximum, and the singularities are particularly bad cases.
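One way to check for a collapsed component is to look at the eigenvalues of each covariance matrix and at the log density at each component mean. The sketch below is a minimal example under stated assumptions: it uses stand-in full covariance matrices (the array Sigma here is made up), and with a fitted model one would presumably substitute obj.Sigma, provided the covariances are stored as full d-by-d-by-K matrices.

```matlab
% Stand-in covariances (d-by-d-by-K). With a fitted model, try Sigma = obj.Sigma
% instead (assumes full, non-shared covariance matrices).
d = 64;
Sigma = cat(3, eye(d), 1e-3 * eye(d));   % second component is nearly collapsed

K = size(Sigma, 3);
minEig = zeros(K, 1);
log10Peak = zeros(K, 1);
for k = 1:K
    ev = eig(Sigma(:, :, k));                               % variances along principal axes
    minEig(k) = min(ev);
    logPeak = -0.5 * (d * log(2*pi) + sum(log(ev)));        % log density at the component mean
    log10Peak(k) = logPeak / log(10);
    fprintf('component %d: min eigenvalue %.3g, log10 peak density %.1f\n', ...
            k, minEig(k), log10Peak(k));
end
```

A component whose smallest eigenvalue is orders of magnitude below the others is a likely candidate for the collapse described above.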

When you see this, rerun the algorithm with different starting points or reduce the number of components you are using. The book above also recommends simply resetting the offending component to a different value.

Another option is a Bayesian approach: place a prior or a regularization term on the parameters, which penalizes degenerate values such as zero sigma parameters.
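To make the effect of a regularization term concrete, here is a small base-MATLAB sketch of one simple form of it: adding a small positive value to the diagonal of the covariance. The collapsed covariance S and the strength lambda are hypothetical values picked for the example. It is not a full Bayesian treatment, only an illustration that the added term keeps the variances away from zero and so bounds the peak density.

```matlab
d = 64;
S = 1e-12 * eye(d);          % hypothetical, nearly collapsed covariance
lambda = 1e-2;               % hypothetical regularization strength
Sreg = S + lambda * eye(d);  % ridge on the diagonal keeps every variance >= lambda

% Log density at the component mean for a covariance matrix C.
logPeak = @(C) -0.5 * (d * log(2*pi) + sum(log(eig(C))));

before = logPeak(S) / log(10);
after  = logPeak(Sreg) / log(10);
bound  = -(d/2) * log10(2*pi*lambda);   % upper bound on log10 peak for any S >= 0

fprintf('log10 peak density: %.1f before, %.1f after (bound %.1f)\n', ...
        before, after, bound);
```

With the ridge in place, the peak density cannot exceed (2*pi*lambda)^(-d/2), no matter how far the unregularized covariance collapses.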

You can control the first remedy (different starting points) indirectly through the starting values used by gmdistribution.fit. For regularization, you can use the Regularize argument: http://www.mathworks.com/help/stats/gmdistribution.fit.html
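Put together, a call might look like the sketch below. It assumes the Statistics Toolbox is installed, and the toy data, the number of components, and the option values are made up for illustration. 'Replicates' repeats the fit from several starting points and keeps the best one, and 'Regularize' adds a small value to the diagonal of each covariance matrix.

```matlab
rng(0);                                   % reproducible toy data (not from the question)
X = [randn(200, 5); randn(200, 5) + 4];   % two well-separated clusters in 5-D
k = 2;                                    % assumed number of components

% Several random restarts plus a small ridge on the covariance diagonals.
obj = gmdistribution.fit(X, k, 'Replicates', 5, 'Regularize', 1e-5);

p = pdf(obj, obj.mu);                     % density at each component mean
disp(p);
```

In newer releases the same fit is usually written with fitgmdist, where the corresponding option is named 'RegularizationValue'.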
