面对 AttributeError: 'list' 对象没有属性 'lower' [英] Facing AttributeError: 'list' object has no attribute 'lower'
本文介绍了面对 AttributeError: 'list' 对象没有属性 'lower'的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我已经发布了我的样本训练数据和测试数据以及我的代码.我正在尝试使用朴素贝叶斯算法来训练模型.
I have posted my sample train data as well as test data along with my code. I'm trying to use Naive Bayes algorithm to train the model.
但是,在评论中我得到了列表.所以,我认为我的代码失败并出现以下错误:
But, in the reviews I'm getting list of list. So, I think my code is failing with the following error:
return lambda x: strip_accents(x.lower())
AttributeError: 'list' object has no attribute 'lower'
你们中的任何人都可以帮助我解决问题,因为我是 Python 新手....
Can anyone of you please help me out regarding the same as I'm new to python ....
review,label
Colors & clarity is superb,positive
Sadly the picture is not nearly as clear or bright as my 40 inch Samsung,negative
test.txt:
review,label
The picture is clear and beautiful,positive
Picture is not clear,negative
我的代码:
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.metrics import confusion_matrix
from sklearn.feature_extraction.text import CountVectorizer
def load_data(filename):
reviews = list()
labels = list()
with open(filename) as file:
file.readline()
for line in file:
line = line.strip().split(',')
labels.append(line[1])
reviews.append(line[0].split())
return reviews, labels
X_train, y_train = load_data('/Users/7000015504/Desktop/Sep_10/sample_train.csv')
X_test, y_test = load_data('/Users/7000015504/Desktop/Sep_10/sample_test.csv')
clf = CountVectorizer()
X_train_one_hot = clf.fit(X_train)
X_test_one_hot = clf.transform(X_test)
bnbc = BernoulliNB(binarize=None)
bnbc.fit(X_train_one_hot, y_train)
score = bnbc.score(X_test_one_hot, y_test)
print("score of Naive Bayes algo is :" , score)
推荐答案
我对您的代码进行了一些修改.下面发布的一个作品;我添加了关于如何调试您在上面发布的评论的评论.
I have applied a few modifications to your code. The one posted below works; I added comments on how to debug the one you posted above.
# These three will not used, do not import them
# from sklearn.preprocessing import MultiLabelBinarizer
# from sklearn.model_selection import train_test_split
# from sklearn.metrics import confusion_matrix
# This performs the classification task that you want with your input data in the format provided
from sklearn.naive_bayes import MultinomialNB
from sklearn.feature_extraction.text import CountVectorizer
def load_data(filename):
""" This function works, but you have to modify the second-to-last line from
reviews.append(line[0].split()) to reviews.append(line[0]).
CountVectorizer will perform the splits by itself as it sees fit, trust him :)"""
reviews = list()
labels = list()
with open(filename) as file:
file.readline()
for line in file:
line = line.strip().split(',')
labels.append(line[1])
reviews.append(line[0])
return reviews, labels
X_train, y_train = load_data('train.txt')
X_test, y_test = load_data('test.txt')
vec = CountVectorizer()
# Notice: clf means classifier, not vectorizer.
# While it is syntactically correct, it's bad practice to give misleading names to your objects.
# Replace "clf" with "vec" or something similar.
# Important! you called only the fit method, but did not transform the data
# afterwards. The fit method does not return the transformed data by itself. You
# either have to call .fit() and then .transform() on your training data, or just fit_transform() once.
X_train_transformed = vec.fit_transform(X_train)
X_test_transformed = vec.transform(X_test)
clf= MultinomialNB()
clf.fit(X_train_transformed, y_train)
score = clf.score(X_test_transformed, y_test)
print("score of Naive Bayes algo is :" , score)
这段代码的输出是:
score of Naive Bayes algo is : 0.5
这篇关于面对 AttributeError: 'list' 对象没有属性 'lower'的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文