Negation handling in sentiment analysis
Question
I need a little help here. I need to identify negated phrases such as "not good" and "not bad", and then determine the polarity (negative or positive) of the sentiment. I have done everything except handle negations, and I just want to know how to incorporate them. How should I go about it?
Recommended answer
Negation handling is quite a broad field, with many different potential implementations. Here I can provide sample code that negates a sequence of text and stores the negated unigrams, bigrams, and trigrams in not_ form (for example, not_happy). Note that nltk is not used here; the code relies on simple text processing instead.
# negate_sequence(text)
#   text: sentence to process (creation of uni/bi/trigrams
#         is handled here)
#
# Detects negations and transforms negated words into 'not_' form
#
def negate_sequence(text):
    negation = False
    delims = "?.,!:;"
    result = []
    words = text.split()
    prev = None
    pprev = None
    for word in words:
        # Normalize the word and prefix it while negation is active
        stripped = word.strip(delims).lower()
        negated = "not_" + stripped if negation else stripped
        result.append(negated)
        if prev:
            bigram = prev + " " + negated
            result.append(bigram)
            if pprev:
                trigram = pprev + " " + bigram
                result.append(trigram)
            pprev = prev
        prev = negated

        # A negation cue toggles the negation state for the following words
        if any(neg in word for neg in ["not", "n't", "no"]):
            negation = not negation

        # Punctuation ends the scope of the current negation
        if any(c in word for c in delims):
            negation = False

    return result
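One caveat about the cue test in the loop above: it is a substring match on the raw word, so "no" also fires inside words such as "know" or "another", and a capitalized "Not" is missed. A minimal sketch of a stricter token-level check follows. The names is_negation_cue and NEGATION_WORDS are hypothetical, introduced only for illustration, and the cue list is an assumption you would extend for your own data.

```python
# Hypothetical cue list (an assumption, not part of the answer's code)
NEGATION_WORDS = {"not", "no"}


def is_negation_cue(word, delims="?.,!:;"):
    """Return True if the whole token is a negation cue.

    The token is stripped of punctuation and lowercased first, then
    compared as a whole word; contractions ending in "n't" also count.
    """
    token = word.strip(delims).lower()
    return token in NEGATION_WORDS or token.endswith("n't")


# The substring test treats "know" as a cue; the token test does not.
substring_hit = any(neg in "know" for neg in ["not", "n't", "no"])
token_hit = is_negation_cue("know")
```

If this fits your data, is_negation_cue(word) could stand in for the any(...) condition inside the loop.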
If we run negate_sequence on the sample input text = "I am not happy today, and I am not feeling well", we obtain the following sequence of unigrams, bigrams, and trigrams:
[ 'i',
'am',
'i am',
'not',
'am not',
'i am not',
'not_happy',
'not not_happy',
'am not not_happy',
'not_today',
'not_happy not_today',
'not not_happy not_today',
'and',
'not_today and',
'not_happy not_today and',
'i',
'and i',
'not_today and i',
'am',
'i am',
'and i am',
'not',
'am not',
'i am not',
'not_feeling',
'not not_feeling',
'am not not_feeling',
'not_well',
'not_feeling not_well',
'not not_feeling not_well']
We may subsequently store these n-grams in an array for future retrieval and analysis. Treat each not_ word as the negative of the [sentiment, polarity] that you have defined for its counterpart.
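To make that last step concrete, here is a minimal sketch of how not_ tokens might be scored. It assumes a word-to-score lexicon; LEXICON, token_polarity, and score_unigrams are hypothetical names, and the scores are made-up illustration values rather than anything from a real sentiment resource.

```python
# Hypothetical toy lexicon: word -> polarity score (illustrative values only)
LEXICON = {"happy": 1.0, "well": 0.5, "good": 1.0, "bad": -1.0}


def token_polarity(token, lexicon=LEXICON):
    """Polarity of a single unigram; 'not_' tokens get the opposite sign."""
    if token.startswith("not_"):
        base = token[len("not_"):]
        return -lexicon.get(base, 0.0)
    return lexicon.get(token, 0.0)


def score_unigrams(tokens, lexicon=LEXICON):
    """Sum polarities over unigrams, skipping bigrams and trigrams
    (any entry that contains a space)."""
    return sum(token_polarity(t, lexicon) for t in tokens if " " not in t)


# A few entries in the shape produced by negate_sequence
sample = ["i", "am", "not", "not_happy", "not not_happy", "not_well"]
total = score_unigrams(sample)
```

Simply flipping the sign is the crudest possible choice; whether "not bad" should count as fully positive depends on your lexicon and task.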