Stanford CoreNLP sentiment training set

Question

I am new to the area of NLP and sentiment analysis in particular. My goal is to train the Stanford CoreNLP sentiment model. I am aware that the sentences provided as training data should be in the following format.

(3 (2 (2 The) (2 Rock)) (4 (3 (2 is) (4 (2 destined) (2 (2 (2 (2 (2 to) (2 (2 be) (2 (2 the) (2 (2 21st) (2 (2 (2 Century) (2 's)) (2 (3 new) (2 (2 ``) (2 Conan)))))))) (2 '')) (2 and)) (3 (2 that) (3 (2 he) (3 (2 's) (3 (2 going) (3 (2 to) (4 (3 (2 make) (3 (3 (2 a) (3 splash)) (2 (2 even) (3 greater)))) (2 (2 than) (2 (2 (2 (2 (1 (2 Arnold) (2 Schwarzenegger)) (2 ,)) (2 (2 Jean-Claud) (2 (2 Van) (2 Damme)))) (2 or)) (2 (2 Steven) (2 Segal))))))))))))) (2 .)))
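To make the format concrete, here is a minimal Python sketch (not part of CoreNLP) that reads one such line into nested tuples, which can help sanity-check hand-labeled trees before training. It assumes integer labels from 0 to 4, as in the Stanford Sentiment Treebank, and words with no whitespace or literal parentheses. The names `parse_tree` and `leaf_words` are hypothetical helpers introduced for illustration.

```python
import re

# A token is "(", ")", or a run of characters without spaces or parentheses.
TOKEN_RE = re.compile(r"\(|\)|[^\s()]+")


def parse_tree(line):
    """Parse one '(label ...)' line into (label, word) or (label, [children])."""
    tokens = TOKEN_RE.findall(line)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of tree")
        tok = tokens[pos]
        pos += 1
        return tok

    def node():
        if take() != "(":
            raise ValueError("expected '('")
        label = int(take())  # raises ValueError if the label is not an integer
        if not 0 <= label <= 4:  # assumption: five sentiment classes
            raise ValueError(f"label out of range: {label}")
        if peek() == "(":
            payload = []
            while peek() == "(":
                payload.append(node())  # internal node: one or more subtrees
        else:
            payload = take()  # leaf node: a single word
            if payload == ")":
                raise ValueError("node has a label but no word")
        if take() != ")":
            raise ValueError("expected ')'")
        return (label, payload)

    tree = node()
    if pos != len(tokens):
        raise ValueError("trailing tokens after the tree")
    return tree


def leaf_words(tree):
    """Return the words of the sentence in left-to-right order."""
    _, payload = tree
    if isinstance(payload, str):
        return [payload]
    return [word for child in payload for word in leaf_words(child)]


tree = parse_tree("(3 (2 The) (2 Rock))")
print(tree[0])           # 3
print(leaf_words(tree))  # ['The', 'Rock']
```

The root label is the sentiment of the whole sentence, and every nested label is the sentiment of the phrase it encloses.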

I am also aware that I can create the sentiment training model with my own training data using the following command.

java -mx8g edu.stanford.nlp.sentiment.SentimentTraining -numHid 25 -trainPath train.txt -devPath dev.txt -train -model model.ser.gz

My question is, do I have access to the training data set that was used to train the model? If yes, then where can I find it? Also, is there a way I can append new sentences to the original training data set and create the train model?

Recommended answer

The data is available here: http://nlp.stanford.edu/sentiment/

If you create a new data set in the same format, you can put the files in a directory and set -trainPath to that directory. It will load all files from that directory and train on them.

Example command:

java -Xmx8g edu.stanford.nlp.sentiment.SentimentTraining -train -numHid 25 -trainPath trees/training-data/ -model model.ser.gz
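One way to append your own sentences, following the directory approach described above, is sketched below in Python. It is an illustration under assumptions, not part of CoreNLP: the function names and the file names `original.txt` and `extra.txt` are hypothetical, and the command it builds only mirrors the flags shown in this answer. Running that command would also require the CoreNLP jars on the Java classpath.

```python
from pathlib import Path
import shutil


def build_training_dir(original_train, extra_trees, out_dir):
    """Put the original trees and the new trees side by side in out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Leave the downloaded data untouched; copy it under a hypothetical name.
    shutil.copy(original_train, out / "original.txt")
    # One labeled tree per line, in the same format as the original file.
    extra = "\n".join(tree.strip() for tree in extra_trees) + "\n"
    (out / "extra.txt").write_text(extra, encoding="utf-8")
    return out


def training_command(train_dir, model_path="model.ser.gz"):
    """Build (but do not run) the training command from the answer."""
    return [
        "java", "-Xmx8g", "edu.stanford.nlp.sentiment.SentimentTraining",
        "-train", "-numHid", "25",
        "-trainPath", str(train_dir),
        "-model", model_path,
    ]
```

For example, `build_training_dir("train.txt", my_trees, "trees/training-data")` followed by `training_command("trees/training-data")` yields an argument list that could be passed to `subprocess.run`. Keeping the new trees in a separate file makes it easy to retrain later with or without them.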
