How to use a custom TokensRegex rules annotator with Stanford CoreNLP Server?


Problem description

The TokensRegex rules color annotator (stanford-corenlp-full-2016-10-31/tokensregex/color.rules.txt) loads successfully when CoreNLP is run from the command line, but fails on the web server with java.lang.IllegalArgumentException: Unknown annotator: color.

Setup

# custom.properties
annotators=tokenize,ssplit,pos,lemma,ner,regexner,color
customAnnotatorClass.color = edu.stanford.nlp.pipeline.TokensRegexAnnotator
color.rules = tokensregex/color.rules.txt

Command line

$ java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props custom.properties -file ./tokensregex/color.input.txt -outputFormat text
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Registering annotator color with class edu.stanford.nlp.pipeline.TokensRegexAnnotator
...
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator color
[main] INFO edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor - Reading TokensRegex rules from tokensregex/color.rules.txt
[main] INFO edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor - Read 7 rules

# color.input.txt.output
Sentence #1 (9 tokens):
Both blue and light blue are nice colors.
[Text=Both CharacterOffsetBegin=0 CharacterOffsetEnd=4 PartOfSpeech=CC Lemma=both NamedEntityTag=O]
[Text=blue CharacterOffsetBegin=5 CharacterOffsetEnd=9 PartOfSpeech=JJ Lemma=blue NamedEntityTag=COLOR NormalizedNamedEntityTag=#0000FF]
...

Server

  1. java -mx2g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -c custom.properties
  2. wget --post-data 'Both blue and light blue are nice colors.' 'localhost:9000/?properties={"annotators":"tokenize,ssplit,pos,lemma,ner,regexner,color","outputFormat":"json"}' -O -

HTTP request sent, awaiting response... 500 Internal Server Error
    2016-11-05 14:41:27 ERROR 500: Internal Server Error.

java.lang.IllegalArgumentException: Unknown annotator: color
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.ensurePrerequisiteAnnotators(StanfordCoreNLP.java:304)
    at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.getProperties(StanfordCoreNLPServer.java:713)
    at edu.stanford.nlp.pipeline.StanfordCoreNLPServer$CoreNLPHandler.handle(StanfordCoreNLPServer.java:540)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
    at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
    at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
    at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Solution

Include the custom annotator properties in the request itself, together with "enforceRequirements":"false":

wget --post-data 'Both blue and light blue are nice colors.' 'localhost:9000/?properties={"color.rules":"tokensregex/color.rules.txt","customAnnotatorClass.color":"edu.stanford.nlp.pipeline.TokensRegexAnnotator","annotators":"tokenize,ssplit,pos,lemma,ner,regexner,color","enforceRequirements":"false","outputFormat":"json"}' -O -
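The same request can also be sent from a script instead of wget. The following is a minimal Python sketch using only the standard library. The names build_url, annotate, and COLOR_PROPERTIES are made up for illustration, and the server is assumed to be listening on localhost:9000 as in the commands above. The properties are serialized to JSON and URL-encoded into the properties query parameter, and the text goes in the POST body.

```python
import json
import urllib.parse
import urllib.request

# Same properties as in the wget request above.
COLOR_PROPERTIES = {
    "color.rules": "tokensregex/color.rules.txt",
    "customAnnotatorClass.color": "edu.stanford.nlp.pipeline.TokensRegexAnnotator",
    "annotators": "tokenize,ssplit,pos,lemma,ner,regexner,color",
    "enforceRequirements": "false",
    "outputFormat": "json",
}


def build_url(base_url, properties):
    """Return base_url with properties as a URL-encoded JSON query parameter."""
    query = urllib.parse.urlencode({"properties": json.dumps(properties)})
    return f"{base_url}/?{query}"


def annotate(text, properties, base_url="http://localhost:9000"):
    """POST text to the server (assumed to be running) and return the parsed JSON."""
    request = urllib.request.Request(
        build_url(base_url, properties),
        data=text.encode("utf-8"),  # supplying data makes this a POST request
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from step 1 running, calling annotate('Both blue and light blue are nice colors.', COLOR_PROPERTIES) should return the parsed JSON annotation. The rules path is sent as-is, so it presumably has to be readable from the directory the server was started in.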

Recommended answer

Add

"enforceRequirements":"false"

to your request and that should stop this error!
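Because the working request repeats the settings already stored in custom.properties, one option is to build the request properties from that file rather than typing them twice. The sketch below is an assumption-laden helper, not part of CoreNLP: parse_properties and request_properties are hypothetical names, and the parser handles only simple key=value lines (no ":" separators, escapes, or line continuations that full Java .properties files allow).

```python
def parse_properties(text):
    """Parse simple key=value lines; skips blank lines and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines without an '=' separator
            props[key.strip()] = value.strip()
    return props


def request_properties(text, output_format="json"):
    """Return per-request properties: file settings plus the flags from the answer."""
    props = parse_properties(text)
    props["enforceRequirements"] = "false"  # the setting recommended in the answer
    props["outputFormat"] = output_format
    return props
```

The resulting dictionary can be serialized with json.dumps and passed as the properties parameter of the request.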
