bnlearn::bn.fit difference and calculation of methods "mle" and "bayes"


Question

I am trying to understand the difference between the two methods bayes and mle in the bn.fit function of the bnlearn package.

I know about the debate between the frequentist and the Bayesian approach to understanding probabilities. On a theoretical level I suppose the maximum likelihood estimate mle is a simple frequentist approach that sets the relative frequencies as the probabilities. But what calculations are done to get the bayes estimate? I already checked the bnlearn documentation, the description of the bn.fit function, and some application examples, but nowhere is there a real description of what actually happens.

I also tried to understand the function in R by first checking out bnlearn::bn.fit, which leads to bnlearn:::bn.fit.backend, which leads to bnlearn:::smartSapply, but then I got stuck.

Some help would be really appreciated, as I use the package for academic work and therefore need to be able to explain what happens.

Answer

Bayesian parameter estimation in bnlearn::bn.fit applies to discrete variables. The key is the optional iss argument: "the imaginary sample size used by the bayes method to estimate the conditional probability tables (CPTs) associated with discrete nodes".

So, for a binary root node X in some network, the bayes option in bnlearn::bn.fit returns (Nx + iss / cptsize) / (N + iss) as the probability of X = x, where N is your number of samples, Nx the number of samples with X = x, and cptsize the size of the CPT of X; in this case cptsize = 2. The relevant code is in the bnlearn:::bn.fit.backend.discrete function, in particular the line:

    tab = tab + extra.args$iss/prod(dim(tab))
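To make the formula concrete, here is a minimal sketch that reproduces the bayes estimate by hand for the binary root node F of the learning.test dataset shipped with bnlearn (the network structure is the one given for learning.test in the bnlearn documentation):

    library(bnlearn)

    data(learning.test)
    dag <- model2network("[A][C][F][B|A][D|A:C][E|B:F]")
    iss <- 10

    fit.bayes <- bn.fit(dag, learning.test, method = "bayes", iss = iss)

    # reproduce (Nx + iss / cptsize) / (N + iss) by hand for root node F
    N       <- nrow(learning.test)        # number of samples
    Nx      <- table(learning.test$F)     # counts of each level of F
    cptsize <- nlevels(learning.test$F)   # 2: F is binary and has no parents

    (Nx + iss / cptsize) / (N + iss)      # should match coef(fit.bayes)$F

For a root node the CPT has no parent dimensions, so prod(dim(tab)) is just the number of levels of F and the iss imaginary observations are spread uniformly over them.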

Thus, iss / cptsize is the number of imaginary observations for each entry in a CPT, as opposed to N, the number of 'real' observations. With iss = 0 you would be getting a maximum likelihood estimate, as you would have no prior imaginary observations.

The higher iss is relative to N, the stronger the effect of the prior on your posterior parameter estimates. With a fixed iss and a growing N, the Bayesian estimator and the maximum likelihood estimator converge to the same value.
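Both points are easy to see by plugging numbers into the expression above. The helper below and the 0.7 frequency are made up for illustration; bn.fit itself expects a positive iss as far as I can tell, so iss = 0 is shown only as the limiting case:

    # toy helper implementing (Nx + iss / cptsize) / (N + iss)
    p_est <- function(Nx, N, iss, cptsize = 2) (Nx + iss / cptsize) / (N + iss)

    p_est(Nx = 70, N = 100, iss = 0)   # 0.7: the relative frequency, i.e. the MLE

    # fixed iss = 10, growing N, observed frequency always 0.7: the estimate
    # starts pulled toward the uniform prior (0.5) and approaches the MLE
    sapply(c(10, 100, 10000), function(N) p_est(Nx = 0.7 * N, N = N, iss = 10))
    # 0.6000000 0.6818182 0.6998002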

A common rule of thumb is to use a small non-zero iss so that you avoid zero entries in the CPTs, corresponding to combinations that were not observed in the data. Such zero entries can result in a network that generalizes poorly, as happened with some early versions of the Pathfinder system.
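Here is a contrived sketch of that problem, again with learning.test: every row containing one particular parent-child combination is dropped, so mle assigns it probability zero while bayes keeps a small positive probability thanks to the imaginary counts:

    library(bnlearn)

    data(learning.test)
    dag <- model2network("[A][C][F][B|A][D|A:C][E|B:F]")

    # drop every row with A == "a" and B == "a", so that this
    # combination is never observed in the data
    sub <- learning.test[!(learning.test$A == "a" & learning.test$B == "a"), ]

    fit.mle <- bn.fit(dag, sub, method = "mle")
    coef(fit.mle)$B["a", "a"]     # 0: a zero entry in the CPT of B

    fit.bayes <- bn.fit(dag, sub, method = "bayes", iss = 1)
    coef(fit.bayes)$B["a", "a"]   # small but non-zero: the CPT of B has
                                  # 9 cells, so each gets iss/9 imaginary counts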

For more details on Bayesian parameter estimation you can have a look at the book by Koller and Friedman (Probabilistic Graphical Models: Principles and Techniques). I suppose many other Bayesian network books also cover the topic.
