AttributeError: getfeature_names not found; using scikit-learn

Problem description

from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
vectorizer = vectorizer.fit(word_data)
freq_term_mat = vectorizer.transform(word_data)

from sklearn.feature_extraction.text import TfidfTransformer

tfidf = TfidfTransformer(norm="l2")
tfidf = tfidf.fit(freq_term_mat)
Ttf_idf_matrix = tfidf.transform(freq_term_mat)

voc_words = Ttf_idf_matrix.getfeature_names()
print "The num of words = ",len(voc_words)

When I run the program containing this code, I get the following error:

Traceback (most recent call last):
  File "vectorize_text.py", line 87, in <module>
    voc_words = Ttf_idf_matrix.getfeature_names()
  File "/home/farheen/anaconda/lib/python2.7/site-packages/scipy/sparse/base.py", line 499, in __getattr__
    raise AttributeError(attr + " not found")
AttributeError: get_feature_names not found

Please suggest a solution.

Recommended answer

I see two problems with your code. First, you are calling get_feature_names() on the matrix output rather than on the vectorizer. The matrix is a SciPy sparse matrix that holds only the numeric values; the vocabulary lives on the fitted vectorizer, so that is where the method must be called. Second, you are breaking this into more steps than necessary. TfidfVectorizer.fit_transform() does the counting and the TF-IDF weighting in a single call. Try this:

from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()
transformed = vectorizer.fit_transform(word_data)
print "Num words:", len(vectorizer.get_feature_names())
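
The snippet above uses the Python 2 print statement and get_feature_names(), matching the Python 2.7 setup shown in the traceback. On Python 3 with a recent scikit-learn the method name differs: get_feature_names_out() was added in scikit-learn 1.0, and get_feature_names() was removed in 1.2. Below is a minimal sketch that should run on either side of that change. The word_data list and the feature_names helper are illustrative assumptions, not part of the original question or answer.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical stand-in for the question's word_data (a list of document strings).
word_data = [
    "the cat sat on the mat",
    "the dog chased the cat",
]

vectorizer = TfidfVectorizer()
tfidf_matrix = vectorizer.fit_transform(word_data)

# The result is a SciPy sparse matrix: it stores weights only, not the vocabulary,
# which is why calling get_feature_names() on it raises AttributeError.
print(hasattr(tfidf_matrix, "get_feature_names"))


def feature_names(fitted_vectorizer):
    """Illustrative helper: return vocabulary terms in column order."""
    # Prefer the newer method name when the installed scikit-learn provides it.
    if hasattr(fitted_vectorizer, "get_feature_names_out"):
        return list(fitted_vectorizer.get_feature_names_out())
    return list(fitted_vectorizer.get_feature_names())


voc_words = feature_names(vectorizer)
print("Num words:", len(voc_words))
```

The number of feature names always equals the number of columns in the matrix, which is a quick sanity check that the names were read from the same vectorizer that produced the matrix.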
