Does NLTK have TF-IDF implemented?
Question
There are TF-IDF implementations in scikit-learn and gensim.
There are also simple standalone implementations, such as "Simple implementation of N-Gram, tf-idf and Cosine similarity in Python".
To avoid reinventing the wheel:
- Is there really no TF-IDF in NLTK?
- Are there sub-packages that we can manipulate to implement TF-IDF in NLTK? If there are, how?
This blog post says NLTK doesn't have it. Is that true? http://www.bogotobogo.com/python/NLTK/tf_idf_with_scikit-learn_NLTK.php
Recommended answer
The NLTK TextCollection class (in nltk.text) has a tf_idf method for computing the tf-idf of terms; see its documentation and source for details. However, the documentation says it "may be slow to load", so using scikit-learn may be preferable.
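As a rough illustration of the answer above, here is a minimal sketch of calling TextCollection.tf_idf on a few hand-tokenized documents. The sample documents and the plain-Python helper tf_idf_reference are made up for illustration. The helper reflects my reading of how NLTK weights terms (relative term frequency times the natural log of N over document frequency), so treat that formula as an assumption and check it against the NLTK source for your version. The NLTK part only runs if NLTK is installed.

```python
import math

# Hypothetical toy corpus: each document is already a list of tokens.
docs = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "chased", "the", "cat"],
    ["the", "bird", "sang"],
]


def tf_idf_reference(term, text, texts):
    """Plain-Python TF-IDF (assumed to mirror NLTK's TextCollection).

    tf  = count of term in text / length of text
    idf = ln(number of texts / number of texts containing term), or 0.0
          if no text contains the term
    """
    tf = text.count(term) / len(text)
    matches = sum(1 for t in texts if term in t)
    idf = math.log(len(texts) / matches) if matches else 0.0
    return tf * idf


reference_score = tf_idf_reference("cat", docs[0], docs)

try:
    from nltk.text import TextCollection

    # TextCollection accepts a list of tokenized texts.
    collection = TextCollection(docs)
    # tf_idf(term, text) scores a term within one text of the collection.
    nltk_score = collection.tf_idf("cat", docs[0])
except ImportError:
    # NLTK is not installed; only the reference value is available.
    nltk_score = None

print("reference:", reference_score)
print("nltk:", nltk_score)
```

A term that occurs in every document (here "the") gets an idf of zero, so its score is zero regardless of how often it appears. For larger corpora, scikit-learn's TfidfVectorizer is likely the more practical choice, as the answer suggests.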