您如何在StanfordCoreNLP代码中使用GATE Twitter词性标记器作为模型? [英] How do you use the GATE Twitter part-of-speech tagger as a model in the StanfordCoreNLP code?
问题描述
您如何在StanfordCoreNLP代码中将GATE Twitter词性标记器用作模型?
How do you use the GATE Twitter part-of-speech tagger as a model in the StanfordCoreNLP code?
模型在这里: https://gate.ac.uk/wiki/twitter -postagger.html .但是,这些模型似乎不是StanfordCoreNLP格式的.
The models are here: https://gate.ac.uk/wiki/twitter-postagger.html . But, the models don't appear to be in the StanfordCoreNLP format.
我尝试从Gate下载模型文件,并将其放入我的类路径中.找到该文件,但没有正确的标题:
I tried downloading the model file from Gate, and putting in it my classpath. The file is found, but is does not have the right header:
Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
//props.put("pos.model", "gate-EN-twitter-fast.model");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
这是堆栈跟踪:
Reading POS tagger model from gate-EN-twitter-fast.model ... Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:558)
at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:260)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:127)
at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:123)
at com.rincaro.mapreduce.apps.StanfordCoreNlpDemo.main(StanfordCoreNlpDemo.java:31)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:857)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:755)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:289)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:253)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:88)
at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:76)
at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:556)
... 5 more
Caused by: java.io.StreamCorruptedException: invalid stream header: EFBFBDEF
at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:802)
at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
at edu.stanford.nlp.tagger.maxent.TaggerConfig.readConfig(TaggerConfig.java:746)
at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:792)
推荐答案
我刚刚尝试了最近发行版(2014年4月11日,适用于v3.3.1)中的模型,使用此命令,效果很好:
I just tried the model from the recent release (2014 April 11 for v3.3.1) and things worked great with this command:
./corenlp.sh -file tweets.txt -pos.model gate-EN-twitter.model -ssplit.newlineIsSentenceBreak always
这篇关于您如何在StanfordCoreNLP代码中使用GATE Twitter词性标记器作为模型?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!