How do you use the GATE Twitter part-of-speech tagger as a model in the StanfordCoreNLP code?

Question

How do you use the GATE Twitter part-of-speech tagger as a model in the StanfordCoreNLP code?

The models are here: https://gate.ac.uk/wiki/twitter-postagger.html. However, they do not appear to be in the StanfordCoreNLP format.

I tried downloading the model file from GATE and putting it in my classpath. The file is found, but it does not have the right header:

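import java.util.Properties;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

Properties props = new Properties();
props.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
// Point the pos annotator at the GATE model on the classpath (this is the line that triggers the error below)
props.put("pos.model", "gate-EN-twitter-fast.model");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);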

Here is the stack trace:

Reading POS tagger model from gate-EN-twitter-fast.model ... Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:558)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:260)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:127)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:123)
    at com.rincaro.mapreduce.apps.StanfordCoreNlpDemo.main(StanfordCoreNlpDemo.java:31)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:857)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:755)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:289)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:253)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:88)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:76)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:556)
    ... 5 more
Caused by: java.io.StreamCorruptedException: invalid stream header: EFBFBDEF
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:802)
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
    at edu.stanford.nlp.tagger.maxent.TaggerConfig.readConfig(TaggerConfig.java:746)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:792)
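
A note on the header in the last exception: `ObjectInputStream` expects a stream to start with the Java serialization magic bytes `AC ED 00 05`, while `EF BF BD` is the UTF-8 encoding of the Unicode replacement character U+FFFD. One possible explanation (an assumption, not something the trace proves) is that the copy of the model on the classpath was passed through a text-mode step that decoded and re-encoded it as UTF-8, which would turn `AC ED` into `EF BF BD EF BF BD`. The sketch below is only a diagnostic aid: the class `ModelHeaderCheck` and its methods are hypothetical names, and the default file name is the one from the question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for inspecting the first bytes of a model file.
class ModelHeaderCheck {

    /** Reads up to the first four bytes of the stream and renders them as upper-case hex. */
    static String headerHex(InputStream in) throws IOException {
        StringBuilder hex = new StringBuilder();
        for (int i = 0; i < 4; i++) {
            int b = in.read();
            if (b < 0) {
                break; // stream is shorter than four bytes
            }
            hex.append(String.format("%02X", b));
        }
        return hex.toString();
    }

    /** Maps a header to a rough diagnosis; the wording is illustrative only. */
    static String diagnose(String hex) {
        if (hex.equals("ACED0005")) {
            return "Java serialization header, looks intact";
        }
        if (hex.startsWith("EFBFBD")) {
            return "starts with the UTF-8 replacement character, possibly re-encoded as text";
        }
        return "unrecognized header";
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the model sits in the working directory unless a path is given.
        String path = args.length > 0 ? args[0] : "gate-EN-twitter-fast.model";
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            String hex = headerHex(in);
            System.out.println(hex + " -> " + diagnose(hex));
        }
    }
}
```

If the freshly downloaded file and the copy that ends up on the classpath report different headers, the build or copy step in between is the likely culprit rather than the model format itself.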

Recommended answer

I just tried the model from the recent release (2014 April 11 for v3.3.1) and things worked great with this command:

./corenlp.sh -file tweets.txt -pos.model gate-EN-twitter.model -ssplit.newlineIsSentenceBreak always
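
Since the question asks about doing this from code, the same settings can be sketched as pipeline properties. This is a minimal sketch, assuming the command-line flags above map directly onto the property names `pos.model` and `ssplit.newlineIsSentenceBreak`; the class `TwitterPosProps` and the method `twitterPosProperties` are hypothetical names, and the annotator list is copied from the question.

```java
import java.util.Properties;

// Hypothetical helper that mirrors the command-line flags as CoreNLP properties.
class TwitterPosProps {

    /** Builds pipeline properties that point the pos annotator at the given model path. */
    static Properties twitterPosProperties(String modelPath) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
        // Equivalent of -pos.model gate-EN-twitter.model
        props.setProperty("pos.model", modelPath);
        // Equivalent of -ssplit.newlineIsSentenceBreak always (treat each line as its own sentence)
        props.setProperty("ssplit.newlineIsSentenceBreak", "always");
        return props;
    }

    public static void main(String[] args) {
        Properties props = twitterPosProperties("gate-EN-twitter.model");
        props.list(System.out);
    }
}
```

With CoreNLP on the classpath, the resulting object would be passed to `new StanfordCoreNLP(props)` exactly as in the question's snippet.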
