Does NLTK have TF-IDF implemented?
Question
There are TF-IDF implementations in scikit-learn and gensim.
There are some simple implementations around, but to avoid reinventing the wheel:
- Is there really no TF-IDF in NLTK?
- Are there sub-packages that we can use to implement TF-IDF in NLTK? If so, how?
In this blogpost, it says NLTK doesn't have it. Is that true? http://www.bogotobogo.com/python/NLTK/tf_idf_with_scikit-learn_NLTK.php
Recommended answer
The NLTK TextCollection class has a method for computing the tf-idf of terms. The documentation is here, and the source is here. However, it says "may be slow to load", so using scikit-learn may be preferable.
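As a rough illustration of the TextCollection approach, here is a minimal sketch. It assumes NLTK is installed and that each document is already a list of tokens; the toy documents, the whitespace tokenization, and the helper name tf_idf_scores are made up for this example and are not part of NLTK.

```python
import math

from nltk.text import TextCollection

# Toy corpus (an assumption for illustration): each document is a token list.
# Plain str.split() is used so that no NLTK data download is required.
documents = [
    "the cat sat".split(),
    "the dog barked".split(),
    "the cat purred".split(),
]

# TextCollection accepts a list of texts (here, lists of tokens).
collection = TextCollection(documents)


def tf_idf_scores(tokens):
    """Hypothetical helper: map each distinct term in `tokens` to its tf-idf."""
    return {term: collection.tf_idf(term, tokens) for term in set(tokens)}


scores = tf_idf_scores(documents[1])

# Print terms from highest to lowest score for the second document.
for term, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{term}: {score:.4f}")
```

In this sketch, a term that occurs in every document (such as "the") scores 0, because its inverse document frequency is log(3/3). Each call to tf_idf scans the token list again, so for larger corpora the scikit-learn or gensim implementations are likely to be the more practical choice.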