Different results between the Bernoulli Naive Bayes in NLTK and in scikit-learn


Problem description

I get quite different results when classifying text (into only two categories) with the Bernoulli Naive Bayes algorithm in NLTK and the one in the scikit-learn module. The overall accuracy of the two is comparable, though far from identical, but the difference in Type I and Type II errors is significant. In particular, the NLTK Naive Bayes classifier gives more Type I than Type II errors, while the scikit-learn one does the opposite. This 'anomaly' seems to be consistent across different features and different training samples. Is there a reason for this? Which of the two is more trustworthy?

Recommended answer

NLTK does not implement Bernoulli Naive Bayes. It implements multinomial Naive Bayes but only allows binary features.
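
To see why the choice of event model can shift which kind of error a classifier makes, here is a minimal pure-Python sketch of the two models named in the answer, applied to the same binary features. It is not the code of NLTK or scikit-learn; the toy training set, the vocabulary, the Laplace smoothing constant and all function names are assumptions made for illustration. The key difference is that the Bernoulli model multiplies in a factor for every vocabulary word, including the absent ones, while the multinomial model only scores the words that are present.

```python
import math

# Toy training set (hypothetical): each document is the set of words it
# contains, i.e. binary presence/absence features.
TRAIN = [
    ({"a", "b", "c"}, "pos"),
    ({"a", "b", "c"}, "pos"),
    ({"a"}, "neg"),
    ({"d"}, "neg"),
]
VOCAB = ["a", "b", "c", "d"]
ALPHA = 1.0  # Laplace smoothing, an assumption for this sketch


def class_stats(label):
    """Return (number of documents, per-word document frequency) for a class."""
    docs = [doc for doc, y in TRAIN if y == label]
    doc_freq = {w: sum(w in doc for doc in docs) for w in VOCAB}
    return len(docs), doc_freq


def bernoulli_log_score(doc, label):
    """Bernoulli event model: every vocabulary word contributes, present or absent."""
    n_docs, doc_freq = class_stats(label)
    score = math.log(n_docs / len(TRAIN))  # class prior
    for w in VOCAB:
        p = (doc_freq[w] + ALPHA) / (n_docs + 2 * ALPHA)
        score += math.log(p if w in doc else 1.0 - p)
    return score


def multinomial_log_score(doc, label):
    """Multinomial event model: only the words present in the document contribute."""
    n_docs, doc_freq = class_stats(label)
    total = sum(doc_freq.values())
    score = math.log(n_docs / len(TRAIN))  # class prior
    for w in doc:
        if w in doc_freq:  # ignore out-of-vocabulary words
            score += math.log((doc_freq[w] + ALPHA) / (total + ALPHA * len(VOCAB)))
    return score


def classify(doc, scorer):
    """Pick the class with the highest log score under the given model."""
    return max(("pos", "neg"), key=lambda label: scorer(doc, label))


test_doc = {"b"}
bernoulli_label = classify(test_doc, bernoulli_log_score)
multinomial_label = classify(test_doc, multinomial_log_score)
print("Bernoulli:", bernoulli_label, "| multinomial:", multinomial_label)
```

On this toy data the two models disagree on the document `{"b"}`: the Bernoulli score penalizes the "pos" class for the missing words "a" and "c" and picks "neg", while the multinomial score looks only at "b" and picks "pos". A systematic difference of this kind is one plausible way for two classifiers with similar overall accuracy to trade Type I errors for Type II errors.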
