Training data for sentiment analysis

Question

Where can I get a corpus of documents that have already been classified as positive or negative for sentiment in the corporate domain? I want a large corpus of documents that review companies, such as the company reviews written by analysts and the media.

I have found corpora of product and movie reviews. Is there a corpus for the business domain that includes reviews of companies and matches the language of business?

Recommended answer

http://www.cs.cornell.edu/home/llee/data/

http://mpqa.cs.pitt.edu/corpora/mpqa_corpus

You can use Twitter, with its smileys serving as sentiment labels, like this: http://web.archive.org/web/20111119181304/http://deepthoughtinc.com/wp-content/uploads/2011/01/Twitter-as-a-Corpus-for-Sentiment-Analysis-and-Opinion-Mining.pdf
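The smiley idea can be sketched roughly as follows. This is a minimal illustration, not the linked paper's exact procedure: the emoticon lists and the function name `label_by_emoticon` are assumptions made for the example, and texts with no emoticon or with conflicting emoticons are simply skipped.

```python
import re

# Hypothetical emoticon lists (assumptions for this sketch); extend for real data.
POSITIVE = (":)", ":-)", ":D", "=)")
NEGATIVE = (":(", ":-(", ":'(")


def _pattern(emoticons):
    # Build one regex that matches any of the given emoticons literally.
    return re.compile("|".join(re.escape(e) for e in emoticons))


POS_RE = _pattern(POSITIVE)
NEG_RE = _pattern(NEGATIVE)


def label_by_emoticon(text):
    """Return (cleaned_text, label), or None if the text gives no usable signal."""
    has_pos = bool(POS_RE.search(text))
    has_neg = bool(NEG_RE.search(text))
    if has_pos == has_neg:
        # Either no emoticon at all, or both kinds: too noisy to label.
        return None
    label = "positive" if has_pos else "negative"
    # Remove the emoticons so a classifier cannot just memorize the label source.
    cleaned = NEG_RE.sub(" ", POS_RE.sub(" ", text))
    cleaned = " ".join(cleaned.split())
    return cleaned, label


if __name__ == "__main__":
    for tweet in ["Great quarter for Acme :)", "Shares fell again :(", "No opinion here"]:
        print(label_by_emoticon(tweet))
```

Labels obtained this way are noisy, so it is worth spot-checking a sample by hand before training on them.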

Hope that gets you started. There is more in the literature if you are interested in specific subtasks such as negation or sentiment scope.

To focus on companies, you could pair one of these methods with topic detection or, more cheaply, just collect a lot of mentions of a given company. Alternatively, you could have your data annotated by Mechanical Turk workers.
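The cheap option, keeping only texts that mention a given company, might look something like the sketch below. The helper names `mentions_company` and `filter_mentions` and the sample company "Acme" are hypothetical; the match is a simple case-insensitive whole-word search, which assumes each alias starts and ends with a letter or digit.

```python
import re


def mentions_company(text, aliases):
    """True if any alias appears as a whole word in text, ignoring case."""
    if not aliases:
        return False
    # \b anchors keep "Acme" from matching inside a longer word like "Acmeist".
    pattern = r"\b(?:" + "|".join(re.escape(a) for a in aliases) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def filter_mentions(docs, aliases):
    """Keep only the documents that mention one of the company aliases."""
    return [d for d in docs if mentions_company(d, aliases)]


if __name__ == "__main__":
    docs = [
        "Acme Corp beat earnings expectations.",
        "The weather is nice.",
        "Analysts downgraded ACME today.",
    ]
    for doc in filter_mentions(docs, ["Acme Corp", "Acme"]):
        print(doc)
```

The filtered texts can then be passed to whichever labeling method you choose, or sent out for manual annotation.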
