Text analysis - Unable to write output of Python program to csv or xls file
Problem description
Hi, I am trying to do sentiment analysis using a Naive Bayes classifier in Python 2.x. It reads the training sentiments from a txt file and then classifies the input as positive or negative based on that sample file. I want the output in the same form as the input: e.g. I have a text file of, let's say, 1000 raw sentiments, and I want the output to show positive or negative against each one. Please help. Below is the code I am using:
import math
import string

def Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, test_string):
    y_values = [0,1]
    prob_values = [None, None]
    for y_value in y_values:
        posterior_prob = 1.0
        for word in test_string.split():
            word = word.lower().translate(None,string.punctuation).strip()
            if y_value == 0:
                if word not in negative:
                    posterior_prob *= 0.0
                else:
                    posterior_prob *= negative[word]
            else:
                if word not in positive:
                    posterior_prob *= 0.0
                else:
                    posterior_prob *= positive[word]
        if y_value == 0:
            prob_values[y_value] = posterior_prob * float(total_negative) / (total_negative + total_positive)
        else:
            prob_values[y_value] = posterior_prob * float(total_positive) / (total_negative + total_positive)
    total_prob_values = 0
    for i in prob_values:
        total_prob_values += i
    for i in range(0,len(prob_values)):
        prob_values[i] = float(prob_values[i]) / total_prob_values
    print prob_values
    if prob_values[0] > prob_values[1]:
        return 0
    else:
        return 1

if __name__ == '__main__':
    sentiment = open(r'C:/Users/documents/sample.txt')
    #Preprocessing of training set
    vocabulary = {}
    positive = {}
    negative = {}
    training_set = []
    TOTAL_WORDS = 0
    total_negative = 0
    total_positive = 0
    for line in sentiment:
        words = line.split()
        y = words[-1].strip()
        y = int(y)
        if y == 0:
            total_negative += 1
        else:
            total_positive += 1
        for word in words:
            word = word.lower().translate(None,string.punctuation).strip()
            if word not in vocabulary and word.isdigit() is False:
                vocabulary[word] = 1
                TOTAL_WORDS += 1
            elif word in vocabulary:
                vocabulary[word] += 1
                TOTAL_WORDS += 1
            #Training
            if y == 0:
                if word not in negative:
                    negative[word] = 1
                else:
                    negative[word] += 1
            else:
                if word not in positive:
                    positive[word] = 1
                else:
                    positive[word] += 1
    for word in vocabulary.keys():
        vocabulary[word] = float(vocabulary[word])/TOTAL_WORDS
    for word in positive.keys():
        positive[word] = float(positive[word])/total_positive
    for word in negative.keys():
        negative[word] = float(negative[word])/total_negative
    test_string = raw_input("Enter the review: \n")
    classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, test_string)
    if classifier == 0:
        print "Negative review"
    else:
        print "Positive review"
Recommended answer
I've checked the GitHub repo you posted in the comments. I tried to run the project, but I got some errors.
Anyway, I've looked at the project structure and the file used to train the naive Bayes algorithm, and I think the following piece of code can be used to write your results to an Excel file:
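import xlsxwriter

# Classify every line of the input file and collect (sentence, label) pairs.
data_to_be_written = []
with open("test11.txt") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Naive_Bayes_Classifier is the function from the question:
        # it returns 0 for a negative review and 1 for a positive one.
        classifier = Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, line)
        result = 'Negative' if classifier == 0 else 'Positive'
        data_to_be_written.append((line, result))

# Create a workbook and add a worksheet.
# xlsxwriter produces the .xlsx format, so the file gets that extension.
workbook = xlsxwriter.Workbook('test.xlsx')
worksheet = workbook.add_worksheet()

# Rows and columns are zero indexed: column 0 holds the sentence, column 1 the label.
for row, (sentence, label) in enumerate(data_to_be_written):
    worksheet.write(row, 0, sentence)
    worksheet.write(row, 1, label)

workbook.close()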
In short, for each line of the file containing the sentences to be tested, I call the classifier and build a structure holding the sentence and its result.
I then loop over that structure and write the Excel file.
To do this I used a Python package called xlsxwriter.
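If a plain csv file is enough, the same loop can be written with the standard library's csv module instead of xlsxwriter. The following is only a minimal sketch, written for Python 3: the function name write_labels_csv, the header names and the classify parameter are my own assumptions, not part of the original project. classify stands for any callable that takes one sentence and returns 0 (negative) or 1 (positive), as the question's function does.

```python
import csv


def write_labels_csv(input_path, output_path, classify):
    """Classify each non-empty line of input_path and write (sentence, label) rows.

    `classify` is a hypothetical callable: it takes a string and returns
    0 (negative) or 1 (positive), like the function in the question.
    Returns the number of data rows written.
    """
    rows_written = 0
    with open(input_path) as source, open(output_path, "w", newline="") as target:
        writer = csv.writer(target)
        writer.writerow(["sentence", "sentiment"])  # header row
        for line in source:
            sentence = line.strip()
            if not sentence:
                continue  # skip blank lines
            label = "Negative" if classify(sentence) == 0 else "Positive"
            writer.writerow([sentence, label])
            rows_written += 1
    return rows_written
```

With the question's variables in scope it could be called as write_labels_csv("test11.txt", "test.csv", lambda s: Naive_Bayes_Classifier(positive, negative, total_negative, total_positive, s)). On Python 2, open the output file with mode "wb" and drop the newline argument.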
As I said before, I had some problems running the project, so this code is untested. It should work, but if you run into trouble, let me know.
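One thing to watch for when running over a whole file: the classifier in the question multiplies by 0.0 for every unseen word, so a sentence containing a word that appears in neither dictionary leaves both probabilities at zero, and the normalisation step then divides by zero. A single such line would stop the whole batch. Here is a small guard as a sketch; the name label_or_unknown and the 'Unknown' label are my own assumptions, and classify is again any callable returning 0 or 1.

```python
def label_or_unknown(classify, sentence):
    """Return 'Negative' or 'Positive', or 'Unknown' if the classifier cannot decide."""
    try:
        return "Negative" if classify(sentence) == 0 else "Positive"
    except ZeroDivisionError:
        # Both class probabilities were 0.0, so normalising them failed.
        return "Unknown"
```

This only keeps the batch running. It does not change how the classifier scores words it has never seen.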