StanfordNLP custom model in Java


Question

I am using Stanford NLP for the first time.
Here is my code so far:

import java.util.Properties;
import java.util.stream.Collectors;

import edu.stanford.nlp.pipeline.CoreDocument;
import edu.stanford.nlp.pipeline.CoreEntityMention;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

// configure the pipeline, including the custom rules file
Properties props = new Properties();
props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner");
props.setProperty("ner.additional.regexner.mapping", "additional.rules");
//props.setProperty("ner.applyFineGrained", "false");

StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

String content = "request count for www.abcd.com";
CoreDocument doc = new CoreDocument(content);
// annotate the document
pipeline.annotate(doc);
// view results
System.out.println("---");
System.out.println("entities found");
for (CoreEntityMention em : doc.entityMentions()) {
    System.out.println("\tdetected entity: \t" + em.text() + "\t" + em.entityType());
}
System.out.println("---");
System.out.println("tokens and ner tags");
String tokensAndNERTags =
    doc.tokens().stream()
        .map(token -> "(" + token.word() + "," + token.ner() + ")")
        .collect(Collectors.joining(" "));
System.out.println(tokensAndNERTags);

I have set the property ner.additional.regexner.mapping to include my own rules.

The rules file (additional.rules) looks like this:

request count   getReq
requestcount    getReq
server details  getSer
serverdetails   getSer

where getReq and getSer are the tags for the corresponding phrases.

When I run my code, I do not get the required output.

Required output for the sample line (request count for www.abcd.com):

request count  ->  getReq

The output I get:

---
entities found
    detected entity:    count   TITLE
    detected entity:    www.abcd.com    URL
---
tokens and ner tags
(request,O) (count,TITLE) (for,O) (www.abcd.com,URL)

What am I doing wrong?
Please help.

Recommended answer

OK, so the problem was in this line:

props.setProperty("ner.additional.regexner.mapping", "additional.rules");

I removed it and added the following line after creating the pipeline:

pipeline.addAnnotator(new TokensRegexNERAnnotator("additional.rules", true));

Now I get the required output.
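
One more thing that may be worth checking with a file like additional.rules: the RegexNER mapping format is documented as tab-separated columns (the token pattern, then the tag), so a file that separates the phrase and the tag with plain spaces might not be read the way you expect. The sketch below uses only the JDK, not CoreNLP; the class name RulesFileSketch and the helpers rule and linesWithoutTab are made up for illustration. It writes the four rules with a real tab character and reports any line that lacks one.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of Stanford CoreNLP.
public class RulesFileSketch {

    // Builds one mapping line: the phrase, a TAB, then the tag.
    static String rule(String phrase, String tag) {
        return phrase + "\t" + tag;
    }

    // Returns the non-blank lines that do not split into at least
    // two tab-separated columns.
    static List<String> linesWithoutTab(List<String> lines) {
        List<String> bad = new ArrayList<>();
        for (String line : lines) {
            if (!line.trim().isEmpty() && line.split("\t").length < 2) {
                bad.add(line);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // Same phrases and tags as in the question.
        List<String> rules = Arrays.asList(
            rule("request count", "getReq"),
            rule("requestcount", "getReq"),
            rule("server details", "getSer"),
            rule("serverdetails", "getSer"));

        // Assumption: the file lives in the working directory,
        // as the relative path in the question suggests.
        Path path = Paths.get("additional.rules");
        Files.write(path, rules);

        // Re-read the file and list any lines missing a tab separator.
        System.out.println("lines without a tab: "
            + linesWithoutTab(Files.readAllLines(path)));
    }
}
```

You can also run linesWithoutTab on an existing rules file to see whether an editor has silently turned the tabs into spaces.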
