CountVectorizer: AttributeError: 'numpy.ndarray' object has no attribute 'lower'

Views: 38 Published: 2021/12/25 14:41:17 Tags: python numpy scikit-learn text-classification

Problem description

I have a one-dimensional array in which each element is a large string. I am trying to use CountVectorizer to convert the text data into numerical vectors. However, I get this error:

AttributeError: 'numpy.ndarray' object has no attribute 'lower'

mealarray contains a large string in each element, and there are 5000 such samples. I am trying to vectorize it as shown below:

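from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer(
    stop_words='english',  # drop common English words such as "the" and "and"
    ngram_range=(1, 1),    # single words only; (1, 1) is the default
    dtype='double',        # store the counts as floating-point values
)
data = vectorizer.fit_transform(mealarray)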

Full stack trace:

File "/Library/Python/2.7/site-packages/sklearn/feature_extraction/text.py", line 817, in fit_transform
    self.fixed_vocabulary_)
  File "/Library/Python/2.7/site-packages/sklearn/feature_extraction/text.py", line 748, in _count_vocab
    for feature in analyze(doc):
  File "/Library/Python/2.7/site-packages/sklearn/feature_extraction/text.py", line 234, in <lambda>
    tokenize(preprocess(self.decode(doc))), stop_words)
  File "/Library/Python/2.7/site-packages/sklearn/feature_extraction/text.py", line 200, in <lambda>
    return lambda x: strip_accents(x.lower())
AttributeError: 'numpy.ndarray' object has no attribute 'lower'

Recommended answer

Check the shape of mealarray. If the argument to fit_transform is an array of strings, it must be a one-dimensional array; that is, mealarray.shape must have the form (n,). For example, you will get the "no attribute 'lower'" error if mealarray has a shape such as (n, 1), because each item the vectorizer iterates over is then a NumPy array rather than a string.
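The shape issue can be reproduced with NumPy alone. The following is a minimal sketch, assuming a small made-up array that stands in for the real mealarray (the three strings are illustrative, not the asker's data). It shows that iterating over an (n, 1) array yields rows that are themselves arrays, which have no lower method, while the flattened array yields strings.

```python
import numpy as np

# Hypothetical stand-in for the real data: three short "documents"
# stored as a column, which gives the problematic (n, 1) shape.
mealarray = np.array([["eggs and toast"], ["rice with beans"], ["tomato soup"]])
print(mealarray.shape)

# Iterating a 2-D array yields its rows; each row is an ndarray, not a string.
first_doc = next(iter(mealarray))
print(type(first_doc).__name__, hasattr(first_doc, "lower"))

# After flattening, each item is a NumPy string, which does have .lower().
flat = mealarray.ravel()
first_flat = next(iter(flat))
print(flat.shape, type(first_flat).__name__, hasattr(first_flat, "lower"))
```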

You could try something like:

data = vectorizer.fit_transform(mealarray.ravel())
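For a self-contained illustration of the fix, here is a rough sketch under stated assumptions: the array name meals and its contents are invented for the example, and the ndim check is just one possible way to guard against the wrong shape before calling fit_transform.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical sample data with the problematic (n, 1) shape.
meals = np.array([
    ["grilled chicken salad"],
    ["chicken noodle soup"],
    ["fruit salad"],
])

# Flatten to shape (n,) so that each item passed to the vectorizer is a string.
if meals.ndim != 1:
    meals = meals.ravel()

vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(meals)

# One row per document, one column per vocabulary term.
print(counts.shape)
print(sorted(vectorizer.vocabulary_))
```

Note that ravel flattens every element into its own document, so it is only appropriate when the extra dimension has length 1, as in the (n, 1) case discussed above.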
