Negation handling in sentiment analysis

Question

I need a little help here. I need to identify negated phrases such as "not good" and "not bad", and then determine the polarity (negative or positive) of the sentiment. I have done everything except handle the negations, and I just want to know how I can include them. How do I go about it?

Recommended answer

Negation handling is quite a broad field, with numerous different potential implementations. Here I can provide sample code that negates a sequence of text and stores the negated unigrams, bigrams, and trigrams in not_ form. Note that nltk is not used here; simple text processing is used instead.

# negate_sequence(text)
#   text: sentence to process (creation of uni/bi/trigrams
#    is handled here)
#
# Detects negations and transforms negated words into 'not_' form
#
def negate_sequence(text):
    negation = False
    delims = "?.,!:;"
    result = []
    words = text.split()
    prev = None
    pprev = None
    for word in words:
        stripped = word.strip(delims).lower()
        negated = "not_" + stripped if negation else stripped
        result.append(negated)
        if prev:
            bigram = prev + " " + negated
            result.append(bigram)
            if pprev:
                trigram = pprev + " " + bigram
                result.append(trigram)
            pprev = prev
        prev = negated

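        # Compare whole lowercased tokens rather than substrings, so that
        # words like "know" or "another" do not toggle negation by accident.
        if stripped in ("not", "no", "cannot") or stripped.endswith("n't"):
            negation = not negation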

        if any(c in word for c in delims):
            negation = False

    return result

If we run this program on a sample input text = "I am not happy today, and I am not feeling well", we obtain the following sequences of unigrams, bigrams, and trigrams:

[   'i',
    'am',
    'i am',
    'not',
    'am not',
    'i am not',
    'not_happy',
    'not not_happy',
    'am not not_happy',
    'not_today',
    'not_happy not_today',
    'not not_happy not_today',
    'and',
    'not_today and',
    'not_happy not_today and',
    'i',
    'and i',
    'not_today and i',
    'am',
    'i am',
    'and i am',
    'not',
    'am not',
    'i am not',
    'not_feeling',
    'not not_feeling',
    'am not not_feeling',
    'not_well',
    'not_feeling not_well',
    'not not_feeling not_well']
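
Because the function returns unigrams, bigrams, and trigrams interleaved in one flat list, it may help to separate them before analysis. The following is only a minimal sketch, assuming the items are space-joined exactly as in the output above. The helper name group_by_order is hypothetical and not part of the original answer.

```python
from collections import defaultdict


def group_by_order(ngrams):
    """Group space-joined n-grams by their number of tokens (1, 2 or 3)."""
    groups = defaultdict(list)
    for item in ngrams:
        # Each n-gram was built by joining tokens with a single space.
        groups[len(item.split(" "))].append(item)
    return dict(groups)


# A short excerpt of the output shown above.
sample = ["i", "am", "i am", "not", "am not", "i am not", "not_happy"]
grouped = group_by_order(sample)
print(grouped[1])  # unigrams
print(grouped[2])  # bigrams
print(grouped[3])  # trigrams
```

Keeping the three orders apart makes it easier to look up single words in a lexicon while still having the longer phrases available as features.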

We may subsequently store these n-grams in an array for future retrieval and analysis. Treat each not_ word as the negative of the [sentiment, polarity] that you have defined for its un-negated counterpart.
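
One way to read "negative of the polarity" is a plain sign flip. The sketch below illustrates that under stated assumptions: LEXICON is a made-up toy dictionary standing in for whatever polarity values you have already defined, and token_polarity and sentence_polarity are hypothetical helper names. Only unigrams are scored here; bigrams and trigrams are skipped.

```python
# Hypothetical toy lexicon: word -> polarity score. Replace it with the
# [sentiment, polarity] values you have already defined.
LEXICON = {"happy": 1.0, "good": 1.0, "well": 0.5, "bad": -1.0}

PREFIX = "not_"


def token_polarity(token, lexicon=LEXICON):
    """Return the polarity of one unigram, flipping the sign for not_ forms."""
    if token.startswith(PREFIX):
        # Look up the un-negated counterpart and negate its score.
        return -lexicon.get(token[len(PREFIX):], 0.0)
    return lexicon.get(token, 0.0)


def sentence_polarity(tokens, lexicon=LEXICON):
    """Sum polarities over unigrams only; items containing a space are skipped."""
    return sum(token_polarity(t, lexicon) for t in tokens if " " not in t)


# An excerpt in the same shape as the list returned by negate_sequence.
tokens = ["i", "am", "not", "not_happy", "not not_happy", "not_today"]
print(sentence_polarity(tokens))
```

A full sign flip is a simplification: "not bad" is usually weaker than "good", so you might prefer to scale the flipped score rather than mirror it exactly.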
