Train a model using named entities

Views: 160 Published: 2020/5/18 0:40:35 Tags: nlp stanford-nlp sentiment-analysis named-entity-recognition pos-tagger

Question

I am looking at Stanford CoreNLP and using its Named Entity Recognizer. I have different kinds of input text and I need to tag it with my own entities. So I started training my own model, but it doesn't seem to be working.

For example, my input text string is "Book of 49 Magazine Articles on Toyota Land Cruiser 1956-1987 Gold Portfolio http://t.co/EqxmY1VmLg http://t.co/F0Vefuoj9Q"

I went through the examples to train my own model, looking only for the words I am interested in.

My jane-austen-emma-ch1.tsv looks like this:

Toyota  PERS
Land Cruiser    PERS

From the input text above, I am only interested in those two terms. One is Toyota and the other is Land Cruiser.
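For comparison, the CRFClassifier training format described in the Stanford NER documentation has one token per line with its label in the second column, and tokens that are not entities carry the background label O. The following is a minimal sketch of producing such lines from the example sentence. The class name TrainingFileSketch, the helper toTsvLines, the whitespace tokenization (which differs from CoreNLP's own tokenizer), and the hard-coded set of entity tokens are all assumptions made for illustration, not part of the original post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TrainingFileSketch {

    // Hypothetical helper: emit one "token<TAB>label" line per
    // whitespace-separated token. Tokens in entityTokens get PERS,
    // every other token gets the background label O.
    public static List<String> toTsvLines(String text, Set<String> entityTokens) {
        List<String> lines = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            String label = entityTokens.contains(token) ? "PERS" : "O";
            lines.add(token + "\t" + label);
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "Book of 49 Magazine Articles on Toyota Land Cruiser 1956-1987 Gold Portfolio";
        // Assumption: "Land" and "Cruiser" are labeled as separate tokens.
        Set<String> entityTokens = new HashSet<>(Arrays.asList("Toyota", "Land", "Cruiser"));
        for (String line : toTsvLines(text, entityTokens)) {
            System.out.println(line);
        }
    }
}
```

A file built this way gives the trainer examples of both the entity class and the background class, which a file containing only PERS rows does not.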

The austen.prop looks like this:

trainFile = jane-austen-emma-ch1.tsv
serializeTo = ner-model.ser.gz
map = word=0,answer=1
useClassFeature=true
useWord=true
useNGrams=true
noMidNGrams=true
useDisjunctive=true
maxNGramLeng=6
usePrev=true
useNext=true
useSequences=true
usePrevSequences=true
maxLeft=1
useTypeSeqs=true
useTypeSeqs2=true
useTypeySequences=true
wordShape=chris2useLC

I run the following command to generate the ner-model.ser.gz file:

java -cp stanford-corenlp-3.4.1.jar edu.stanford.nlp.ie.crf.CRFClassifier -prop austen.prop

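import java.util.List;

import edu.stanford.nlp.ie.NERClassifierCombiner;
import edu.stanford.nlp.ling.CoreAnnotations.AnswerAnnotation;
import edu.stanford.nlp.ling.CoreLabel;

// The class name NerDemo is a placeholder; the original post showed only main().
public class NerDemo {
    public static void main(String[] args) {
        // Stock 7-class model from the CoreNLP models jar.
        String serializedClassifier = "edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz";
        // Custom model produced by the training command above.
        String serializedClassifier2 = "C:/standford-ner/ner-model.ser.gz";
        try {
            // The two boolean flags turn off the numeric classifiers and SUTime.
            NERClassifierCombiner classifier = new NERClassifierCombiner(false, false,
                    serializedClassifier2, serializedClassifier);
            String ss = "Book of 49 Magazine Articles on Toyota Land Cruiser 1956-1987 Gold Portfolio http://t.co/EqxmY1VmLg http://t.co/F0Vefuoj9Q";
            System.out.println("---");
            // classify() returns one list of labeled tokens per sentence.
            List<List<CoreLabel>> out = classifier.classify(ss);
            for (List<CoreLabel> sentence : out) {
                for (CoreLabel word : sentence) {
                    // Print each token as word/LABEL.
                    System.out.print(word.word() + '/' + word.get(AnswerAnnotation.class) + ' ');
                }
                System.out.println();
            }
        } catch (Exception e) {
            // Covers model loading failures as well as runtime errors.
            e.printStackTrace();
        }
    }
}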

This is the output I get:

Book/PERS of/PERS 49/O Magazine/PERS Articles/PERS on/O Toyota/PERS Land/PERS Cruiser/PERS 1956-1987/PERS Gold/O Portfolio/PERS http://t.co/EqxmY1VmLg/PERS http://t.co/F0Vefuoj9Q/PERS

I think this is wrong. I am looking for Toyota/PERS and Land Cruiser/PERS (which is a multi-valued field).
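The classifier emits one label per token, so a multi-word entity such as Land Cruiser shows up as two adjacent labeled tokens rather than one unit. Below is a minimal sketch of merging adjacent tokens that share a non-O label into a single span. The class EntitySpanSketch and the helper mergeSpans are hypothetical names, and the sketch works on plain token and label arrays rather than CoreLabel objects. Note that with a single PERS label, Toyota and Land Cruiser would merge into one span because they are adjacent; keeping them apart would need distinct labels or some other boundary marking.

```java
import java.util.ArrayList;
import java.util.List;

public class EntitySpanSketch {

    // Hypothetical helper: merge runs of adjacent tokens that carry the
    // same non-O label into "text/LABEL" strings. O tokens are dropped.
    public static List<String> mergeSpans(String[] tokens, String[] labels) {
        List<String> spans = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        String currentLabel = "O";
        for (int i = 0; i < tokens.length; i++) {
            String label = labels[i];
            if (!label.equals("O") && label.equals(currentLabel)) {
                // Same entity label as the previous token: extend the span.
                current.append(' ').append(tokens[i]);
                continue;
            }
            // Label changed: flush the span collected so far, if any.
            if (!currentLabel.equals("O")) {
                spans.add(current + "/" + currentLabel);
            }
            current.setLength(0);
            currentLabel = label;
            if (!label.equals("O")) {
                current.append(tokens[i]);
            }
        }
        // Flush a span that runs to the end of the input.
        if (!currentLabel.equals("O")) {
            spans.add(current + "/" + currentLabel);
        }
        return spans;
    }

    public static void main(String[] args) {
        // Assumed labels for illustration: ORG for the make, PERS for the model.
        String[] tokens = {"on", "Toyota", "Land", "Cruiser", "1956-1987"};
        String[] labels = {"O", "ORG", "PERS", "PERS", "O"};
        System.out.println(mergeSpans(tokens, labels));
    }
}
```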

Thanks for the help. Any help is really appreciated.
