Train corpus for NER with NLTK ieer or conll2000 corpus

Views: 244 | Published: 2020/5/18 1:24:11 | Tags: python nltk named-entity-recognition

Question

I have been trying to train a Named Entity Recognition model for a specific domain, with new entity types. There does not seem to be a complete, suitable pipeline for this, so different packages have to be combined.

I would like to give NLTK a chance. My question is: how can I train the NLTK NER to classify and match new entities using the ieer corpus?

I will of course provide training data in IOB format, like:

We PRP B-NP
saw VBD O
the DT B-NP
yellow JJ I-NP
dog NN I-NP
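
To make the format concrete, here is a minimal sketch in plain Python of how a file like this could be read back into sentences and chunk spans. The names `read_iob` and `extract_chunks` are hypothetical helpers made up for illustration, not NLTK functions, and the sketch assumes three whitespace-separated columns per line with a blank line between sentences.

```python
from typing import List, Tuple

Token = Tuple[str, str, str]  # (word, POS tag, IOB chunk tag)

SAMPLE = """We PRP B-NP
saw VBD O
the DT B-NP
yellow JJ I-NP
dog NN I-NP
"""


def read_iob(text: str) -> List[List[Token]]:
    """Split IOB text into sentences of (word, pos, iob) triples."""
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:  # a blank line closes the current sentence
            if current:
                sentences.append(current)
                current = []
            continue
        word, pos, iob = line.split()
        current.append((word, pos, iob))
    if current:
        sentences.append(current)
    return sentences


def extract_chunks(sentence: List[Token]) -> List[Tuple[str, List[str]]]:
    """Group tokens into (label, words) spans using the B-/I-/O prefixes."""
    spans: List[Tuple[str, List[str]]] = []
    inside = None  # label of the chunk currently open, if any
    for word, _pos, iob in sentence:
        prefix, _, label = iob.partition("-")
        if prefix == "B" or (prefix == "I" and inside != label):
            # B- always opens a chunk; a stray I- is treated as an opener too
            spans.append((label, [word]))
            inside = label
        elif prefix == "I":
            spans[-1][1].append(word)
        else:  # "O": outside any chunk
            inside = None
    return spans


sentences = read_iob(SAMPLE)
print(extract_chunks(sentences[0]))
```

For entity data the same layout would carry entity labels in the third column in place of `NP` (whatever label set the domain needs).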

I guess I will have to tag the tokens myself.

Once I have a text file in this format, what do I do next? What are the steps to train on my data with the ieer corpus, or with a better one, conll2000?
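
For reference, one possible shape of the training step is the unigram chunker pattern from the NLTK book (chapter 7), sketched below. It is a simple baseline that predicts an IOB tag from each POS tag, not necessarily the approach an answer would recommend. The training text here is the five-line sample inlined as a string so the sketch runs without downloading any corpus; the class name `UnigramChunker` is just an illustrative choice.

```python
import nltk
from nltk.chunk import conllstr2tree, conlltags2tree, tree2conlltags

# Assumption: each training sentence is one IOB-formatted string.
# conllstr2tree only keeps chunk labels listed in chunk_types,
# so custom entity labels would have to be passed there.
TRAIN_TEXT = """We PRP B-NP
saw VBD O
the DT B-NP
yellow JJ I-NP
dog NN I-NP"""

train_sents = [conllstr2tree(TRAIN_TEXT, chunk_types=("NP",))]
# With the conll2000 corpus installed (nltk.download("conll2000")),
# this could instead be:
# train_sents = nltk.corpus.conll2000.chunked_sents("train.txt", chunk_types=["NP"])


class UnigramChunker(nltk.ChunkParserI):
    """Baseline chunker: learns the most likely IOB tag for each POS tag."""

    def __init__(self, chunked_sents):
        # Turn each tree into a sequence of (pos, iob) pairs for the tagger.
        train_data = [
            [(pos, iob) for _word, pos, iob in tree2conlltags(sent)]
            for sent in chunked_sents
        ]
        self.tagger = nltk.UnigramTagger(train_data)

    def parse(self, tagged_sentence):
        # tagged_sentence is a list of (word, pos) pairs.
        pos_tags = [pos for _word, pos in tagged_sentence]
        iob_tags = [iob for _pos, iob in self.tagger.tag(pos_tags)]
        conlltags = [
            (word, pos, iob)
            for (word, pos), iob in zip(tagged_sentence, iob_tags)
        ]
        # POS tags never seen in training get None, which is treated as "O".
        return conlltags2tree(conlltags)


chunker = UnigramChunker(train_sents)
tree = chunker.parse([("They", "PRP"), ("fed", "VBD"), ("a", "DT"), ("cat", "NN")])
print(tree2conlltags(tree))
```

A chunker trained on a single sentence is only useful for checking that the plumbing works; real training would need many tagged sentences.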

I know there is some documentation out there, but it is not clear to me what to do once the training corpus has been tagged.

I want to go with NLTK because I then want to use the relextract() function.

Any suggestions are welcome.

Thanks.
