Best Algorithmic Approach to Sentiment Analysis


Question

My requirement is to take in news articles and determine whether they are positive or negative about a subject. I am taking the approach outlined below, but I keep reading that NLP may be of use here. Everything I have read points to NLP distinguishing opinion from fact, which I don't think would matter much in my case. I'm wondering two things:

1) Why wouldn't my algorithm work, and/or how can I improve it? (I know sarcasm would probably be a pitfall, but again I don't see that occurring much in the type of news we will be getting.)

2) How would NLP help, and why should I use it?

My algorithmic approach (I have dictionaries of positive, negative, and negation words):

1) Count the number of positive and negative words in the article.

2) If a negation word is found within 2 or 3 words of a positive or negative word (e.g. "NOT the best"), negate the score.

3) Multiply the scores by weights that have been manually assigned to each word (1.0 to start).

4) Add up the totals for positive and negative to get the sentiment score.
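
The four steps above can be sketched roughly as follows. This is only a minimal illustration, not the asker's actual code: the tiny lexicons, the `WINDOW` constant, the regex tokenizer and the name `sentiment_score` are all assumptions made for the example, and it only looks for negation words *before* the sentiment word.

```python
import re

# Hypothetical toy lexicons mapping each word to its manually assigned weight
# (1.0 to start); a real system would load much larger dictionaries.
POSITIVE = {"good": 1.0, "best": 1.0, "great": 1.0}
NEGATIVE = {"bad": 1.0, "terrible": 1.0, "worst": 1.0}
NEGATIONS = {"not", "no", "never"}
WINDOW = 3  # assumed: how many preceding words to search for a negation


def sentiment_score(text):
    """Return the summed, weighted, negation-adjusted score for a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    for i, tok in enumerate(tokens):
        # Steps 1 and 3: find sentiment words and apply their weights.
        if tok in POSITIVE:
            value = POSITIVE[tok]
        elif tok in NEGATIVE:
            value = -NEGATIVE[tok]
        else:
            continue
        # Step 2: flip the sign if a negation word appears just before.
        if any(w in NEGATIONS for w in tokens[max(0, i - WINDOW):i]):
            value = -value
        # Step 4: accumulate into a single score.
        score += value
    return score


print(sentiment_score("This is the best phone"))      # positive score
print(sentiment_score("This is not the best phone"))  # negated, so negative
```

A positive result suggests positive sentiment and a negative result the opposite; where to draw the threshold for "neutral" is left open here.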

Recommended answer

I don't think there's anything particularly wrong with your algorithm. It's a fairly straightforward and practical way to go, but there are a lot of situations where it will make mistakes.

  1. Ambiguous sentiment words - "This product works terribly" vs. "This product is terribly good"

  2. Missed negations - "I would never in a million years say that this product is worth buying"

  3. Quoted/Indirect text - "My dad says this product is terrible, but I disagree"

  4. Comparisons - "This product is about as useful as a hole in the head"

  5. Anything subtle - "This product is ugly, slow and uninspiring, but it's the only thing on the market that does the job"
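
To make these failure modes concrete, here is a small sketch that runs a word-counting scorer of the kind described in the question over the example sentences. The lexicon entries, their weights, the three-word negation window and the function name `naive_score` are all hypothetical choices for illustration; a different dictionary would give different numbers, but the same kinds of errors.

```python
import re

# Hypothetical signed lexicon: positive weights for positive words,
# negative weights for negative words.
LEXICON = {
    "good": 1.0, "useful": 1.0, "worth": 1.0,
    "terribly": -1.0, "terrible": -1.0,
    "ugly": -1.0, "slow": -1.0, "uninspiring": -1.0,
}
NEGATIONS = {"not", "no", "never"}
WINDOW = 3  # assumed: negation must appear within the 3 preceding words


def naive_score(text):
    """Sum lexicon weights, flipping the sign after a nearby negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    for i, tok in enumerate(tokens):
        if tok not in LEXICON:
            continue
        value = LEXICON[tok]
        if any(w in NEGATIONS for w in tokens[max(0, i - WINDOW):i]):
            value = -value
        score += value
    return score


examples = [
    "This product works terribly",
    "This product is terribly good",
    "I would never in a million years say that this product is worth buying",
    "My dad says this product is terrible, but I disagree",
    "This product is about as useful as a hole in the head",
    "This product is ugly, slow and uninspiring, "
    "but it's the only thing on the market that does the job",
]
for sentence in examples:
    print(f"{naive_score(sentence):+.1f}  {sentence}")
```

With this lexicon, "terribly good" cancels out to zero, the "never ... worth buying" sentence scores positive because the negation is too far away, the quoted opinion is counted as the writer's own, and the sarcastic comparison scores positive.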

I'm using product reviews for examples instead of news stories, but you get the idea. In fact, news articles are probably harder, because they often try to show both sides of an argument and tend to use a certain style to convey a point. The final example, for instance, is quite common in opinion pieces.

As far as NLP helping you with any of this: word sense disambiguation (or even just part-of-speech tagging) may help with (1), syntactic parsing might help with the long-range dependencies in (2), and some kind of chunking might help with (3). It's all research-level work though; there's nothing that I know of that you can use directly. Issues (4) and (5) are a lot harder; I throw up my hands and give up at this point.

I'd stick with the approach you have and look at the output carefully to see whether it is doing what you want. Of course, that then raises the issue of what you understand the definition of "sentiment" to be in the first place...
