How to implement TF-IDF feature weighting with Naive Bayes


Question


I'm trying to implement a naive Bayes classifier for sentiment analysis. I plan to use the TF-IDF weighting measure, but I'm a little stuck. NB generally uses word (feature) frequencies to find the maximum likelihood. So how do I introduce the TF-IDF weighting measure into naive Bayes?

Answer


You use the TF-IDF weights as features/predictors in your statistical model. I suggest using either gensim [1] or scikit-learn [2] to compute the weights, which you then pass to your naive Bayes fitting procedure.


The scikit-learn 'working with text' tutorial [3] might also be of interest.

[1] http://radimrehurek.com/gensim/models/tfidfmodel.html

[2] http://scikit-learn.org/dev/modules/generated/sklearn.feature_extraction.text.TfidfTransformer.html

[3] http://scikit-learn.github.io/scikit-learn-tutorial/working_with_text_data.html

