Formatting NER output from Stanford CoreNLP

Views: 203 | Published: 2020/8/6 3:02:03 | Tag: stanford-nlp


Question

I am working with Stanford CoreNLP and using it for NER. When I extract organization names, each word is tagged separately. So if the entity is "NEW YORK TIMES", it is recorded as three different entities: "NEW", "YORK" and "TIMES". Is there a property we can set in Stanford CoreNLP so that we get the combined string as a single entity?

In Stanford NER, when we use the command-line utility, we can choose inlineXML as the output format. Can we set a property to select the output format in Stanford CoreNLP in the same way?

Recommended answer

If you just want the complete string of each named entity found by Stanford NER, try this:

String text = "<INSERT YOUR INPUT TEXT HERE>";
// getDefaultClassifier() returns a CRFClassifier<CoreLabel>; the default model must be on the classpath.
AbstractSequenceClassifier<CoreLabel> ner = CRFClassifier.getDefaultClassifier();
// Each triple is (entity class, begin character offset, end character offset).
List<Triple<String, Integer, Integer>> entities = ner.classifyToCharacterOffsets(text);
for (Triple<String, Integer, Integer> entity : entities)
    System.out.println(text.substring(entity.second, entity.third));

In case you're wondering, the entity class (for example PERSON or LOCATION) is given by entity.first.
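
To make the shape of those triples concrete, here is a minimal sketch that pairs each entity class with its full string. It does not call CoreNLP: Span is a hypothetical stand-in for Triple<String, Integer, Integer>, EntityFormatter and format are made-up names, and the offsets are written by hand for illustration rather than taken from real classifier output.

```java
import java.util.ArrayList;
import java.util.List;

public class EntityFormatter {
    // Hypothetical stand-in for CoreNLP's Triple<String, Integer, Integer>:
    // (entity class, begin character offset, end character offset).
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Builds one "TYPE<TAB>text" line per span by slicing the original text.
    static List<String> format(String text, List<Span> spans) {
        List<String> lines = new ArrayList<>();
        for (Span s : spans) {
            lines.add(s.type + "\t" + text.substring(s.begin, s.end));
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "The New York Times is based in New York.";
        // Hand-written offsets (assumption), not real classifier output.
        List<Span> spans = List.of(
            new Span("ORGANIZATION", 4, 18),
            new Span("LOCATION", 31, 39));
        for (String line : format(text, spans)) {
            System.out.println(line);
        }
    }
}
```

With real output you would pass entity.first, entity.second and entity.third in place of the hand-written values, so a multi-word name comes back as a single string.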

Alternatively, you can use ner.classifyWithInlineXML(text) to get output that looks like <PERSON>Bill Smith</PERSON> went to <LOCATION>Paris</LOCATION> .
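
If you go the inline XML route and still need the entities as a list, one possible approach is to pull them out of the tagged string with a regular expression. The sketch below is plain Java and makes some assumptions: tag names consist of upper-case letters or underscores, tags are not nested, and the input text does not itself contain similar markup. InlineXmlEntities and extract are hypothetical names, not part of CoreNLP.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InlineXmlEntities {
    // <TAG>text</TAG>; the backreference \1 requires the closing tag
    // to match the opening tag.
    private static final Pattern ENTITY =
        Pattern.compile("<([A-Z_]+)>(.*?)</\\1>");

    // Returns {entity class, entity text} pairs in order of appearance.
    static List<String[]> extract(String tagged) {
        List<String[]> entities = new ArrayList<>();
        Matcher m = ENTITY.matcher(tagged);
        while (m.find()) {
            entities.add(new String[] { m.group(1), m.group(2) });
        }
        return entities;
    }

    public static void main(String[] args) {
        String tagged =
            "<PERSON>Bill Smith</PERSON> went to <LOCATION>Paris</LOCATION> .";
        for (String[] e : extract(tagged)) {
            System.out.println(e[0] + ": " + e[1]);
        }
        // Prints:
        // PERSON: Bill Smith
        // LOCATION: Paris
    }
}
```

For anything beyond quick extraction, classifyToCharacterOffsets is likely the safer choice, since it avoids re-parsing text that may contain angle brackets of its own.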
