Multinomial Naive Bayes classifier in R


Question

I am re-asking the question (with the same name) Multinomial Naive Bayes Classifier. That question has an accepted answer that I think is either wrong or in need of more explanation, because I still don't understand it.

So far, every Naive Bayes classifier implementation that I've seen in R (including bnlearn and klaR) assumes that the features have Gaussian likelihoods.

Is there an implementation of a Naive Bayes classifier in R that uses multinomial likelihoods (akin to scikit-learn's MultinomialNB)?

In particular -- if it turns out there is some way of calling naive.bayes in either of these packages so that the likelihoods are estimated with a multinomial distribution -- I would really appreciate an example of how that's done. I've searched for examples and haven't found any. For example: is this what the usekernel argument in klaR::NaiveBayes is for?
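(From the klaR documentation, usekernel only seems to toggle a kernel density estimate for the numeric features in place of a Gaussian fit, so it does not give multinomial likelihoods either. A minimal sketch of what I have tried, on the built-in iris data and assuming klaR is installed:)

    # Sketch: usekernel switches the per-feature likelihood between a Gaussian
    # fit (FALSE) and a kernel density estimate (TRUE) -- neither is multinomial.
    library(klaR)
    data(iris)

    fit_gauss  <- NaiveBayes(Species ~ ., data = iris, usekernel = FALSE)
    fit_kernel <- NaiveBayes(Species ~ ., data = iris, usekernel = TRUE)

    head(predict(fit_gauss,  newdata = iris[, -5])$posterior, 2)
    head(predict(fit_kernel, newdata = iris[, -5])$posterior, 2)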

Answer

I don't know what algorithm the predict method uses on naive.bayes models, but you can calculate the predictions yourself from the conditional probability tables (the MLE estimates).

    # You may need to get dependencies of gRain from here:
    #   source("http://bioconductor.org/biocLite.R")
    #   biocLite("RBGL")

    library(bnlearn)
    library(gRain)

Using the first example from the naive.bayes help page:

    data(learning.test)

    # fit model
    bn <- naive.bayes(learning.test, "A")   

    # look at cpt's
    fit <- bn.fit(bn, learning.test)    

    # check that the cpt's (proportions) are the mle of the multinomial dist.
    # Node A:
    all.equal(prop.table(table(learning.test$A)), fit$A$prob)
    # Node B:
    all.equal(prop.table(table(learning.test$B, learning.test$A),2), fit$B$prob)


    # look at predictions - include probabilities 
    pred <- predict(bn, learning.test, prob=TRUE)
    pr <- data.frame(t(attributes(pred)$prob))
    pr <- cbind(pred, pr)

    head(pr, 2)

#   preds          a          b          c
# 1     c 0.29990442 0.33609392 0.36400165
# 2     a 0.80321241 0.17406706 0.02272053
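As a sanity check that these predictions are just the MLE tables multiplied together, here is a small sketch that reproduces the posterior for the first observation directly from the CPTs (it simply applies Bayes' rule by hand, reusing fit from above):

    # Sketch: naive Bayes posterior for row 1, computed directly from the CPTs.
    # P(A = a | B..F) is proportional to P(A = a) * prod_j P(x_j | A = a).
    x1 <- learning.test[1, c("B", "C", "D", "E", "F")]

    post <- sapply(levels(learning.test$A), function(a) {
      prior <- fit$A$prob[a]
      liks  <- sapply(names(x1), function(nm) fit[[nm]]$prob[as.character(x1[[nm]]), a])
      prior * prod(liks)
    })

    post / sum(post)   # should be very close to the first row of `pr` above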

Calculate the prediction probabilities from the CPTs by running queries, using gRain:

    # query using the junction tree algorithm
    jj <- compile(as.grain(fit))

    # Get predicted probs for first observation
    net1 <- setEvidence(jj, nodes=c("B", "C", "D", "E", "F"), 
                                         states=c("c", "b", "a", "b", "b"))

    querygrain(net1, nodes="A", type="marginal")

# $A
# A
#        a         b         c 
# 0.3001765 0.3368022 0.3630213 

    # Get predicted probs for second observation
    net2 <- setEvidence(jj, nodes=c("B", "C", "D", "E", "F"), 
                                         states=c("a", "c", "a", "b", "b"))

    querygrain(net2, nodes="A", type="marginal")

# $A
# A
#         a          b          c 
# 0.80311043 0.17425364 0.02263593 
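The same query can be repeated for the first few observations to compare against bnlearn's output (a small sketch reusing jj and learning.test from above):

    # Sketch: run the junction-tree query for the first 5 rows of learning.test.
    grain_probs <- t(sapply(1:5, function(i) {
      states_i <- sapply(learning.test[i, c("B", "C", "D", "E", "F")], as.character)
      ev <- setEvidence(jj, nodes = c("B", "C", "D", "E", "F"), states = states_i)
      querygrain(ev, nodes = "A", type = "marginal")$A
    }))
    grain_probs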

So these probabilities are pretty close to what you get from bnlearn, and they are calculated using the MLEs.
