Why is the following tfidf vectorization failing?

Question

Hello, I am running the following experiment. First I created a vectorizer called tfidf_vectorizer:

from sklearn.feature_extraction.text import TfidfVectorizer
tfidf_vectorizer = TfidfVectorizer(min_df=10, ngram_range=(1, 3), analyzer='word', max_features=500)

Then I vectorized the following list:

tfidf = tfidf_vectorizer.fit_transform(listComments)

My list of comments looks like this:

listComments = ["hello this is a test","the car is red",...]

I tried to save the model as follows:

import pickle

# Saving tfidf
with open('vectorizerTFIDF.pickle','wb') as idxf:
    pickle.dump(tfidf, idxf, pickle.HIGHEST_PROTOCOL)

I would like to use my vectorizer to apply the same tfidf to the following list:

lastComment = ["this is a car"]

Loading the model:

with open('vectorizerTFIDF.pickle', 'rb') as infile:
    tdf = pickle.load(infile)

vector = tdf.transform(lastComment)

However, I get:

Traceback (most recent call last):
  File "C:/Users/LDA_test/ldaTest.py", line 141, in <module>
    vector = tdf.transform(lastComment)
  File "C:\Program Files\Anaconda3\lib\site-packages\scipy\sparse\base.py", line 559, in __getattr__
    raise AttributeError(attr + " not found")
AttributeError: transform not found

I hope someone can help me with this issue. Thanks in advance.

Recommended answer

You've pickled the vectorized array (the sparse matrix returned by fit_transform), not the transformer. The matrix has no transform method, which is why the AttributeError is raised from scipy. You need pickle.dump(tfidf_vectorizer, idxf, pickle.HIGHEST_PROTOCOL).
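As a rough illustration of that fix, here is a minimal sketch of the full save and load round trip. The three-sentence corpus, the temporary file location, and the snake_case variable names are assumptions made for the example. min_df is lowered to 1 only because the question's min_df=10 needs a much larger corpus than this toy list.

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus (hypothetical); the question's real list is much longer.
list_comments = ["hello this is a test", "the car is red", "this car is a test car"]

# min_df=1 is an assumption for this tiny corpus; the question uses min_df=10.
tfidf_vectorizer = TfidfVectorizer(min_df=1, ngram_range=(1, 3),
                                   analyzer="word", max_features=500)

# fit_transform returns a scipy sparse matrix, which has no transform method.
tfidf = tfidf_vectorizer.fit_transform(list_comments)

# Write to the temp directory so the example does not touch the working directory.
path = os.path.join(tempfile.gettempdir(), "vectorizerTFIDF.pickle")

# Pickle the fitted vectorizer itself, not the matrix it produced.
with open(path, "wb") as idxf:
    pickle.dump(tfidf_vectorizer, idxf, pickle.HIGHEST_PROTOCOL)

# Load the vectorizer back and reuse its learned vocabulary and idf weights.
with open(path, "rb") as infile:
    loaded_vectorizer = pickle.load(infile)

last_comment = ["this is a car"]
vector = loaded_vectorizer.transform(last_comment)

print(type(tfidf).__name__, type(loaded_vectorizer).__name__)
print(vector.shape)
```

The resulting vector has one row and as many columns as the training matrix, so it can be compared directly against the rows of tfidf.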
