Is it possible to train the Stanford NER system to recognize more named entity types?

Question

I'm currently using some NLP libraries (Stanford and NLTK). I have seen the Stanford NER demo, but I want to ask whether it is possible to use it to identify more entity types.

Currently the Stanford NER system (as the demo shows) can recognize entities as person (name), organization, or location. But the organizations it recognizes are limited to universities or some big organizations. I'm wondering whether I can use its API to write a program that handles more entity types, so that if my input is "Apple" or "Square" it can recognize it as a company.

Do I have to make my own training dataset?

Furthermore, if I ever want to extract entities and the relationships between them, I feel I should use the Stanford dependency parser. I mean, first extract the named entities and the other parts tagged as "noun", then find the relations between them.

Am I correct?

Thanks.

Recommended answer

Yes, you need your own training set. The pre-trained Stanford models only recognise the word "Stanford" as a named entity because they were trained on data in which that word (or very similar words according to the feature set they use, which I don't know) was marked as a named entity.

Once you have more data, you need to put it in the right format, as described in this question and the Stanford tutorial.
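
As a rough illustration of what "the right format" can look like, here is a minimal Python sketch that writes hand-labeled sentences as tab-separated token/label lines, one token per line with a blank line between sentences. This assumes the two-column layout that, as far as I know, the Stanford NER training documentation describes (word in column 0, label in column 1, "O" for non-entities). The helper name `to_ner_tsv`, the example sentences, and the `COMPANY` label are all made up for illustration; check the Stanford tutorial for the exact format your version expects.

```python
from typing import Iterable, List, Tuple

# One sentence is a list of (token, label) pairs; "O" marks non-entity tokens.
Sentence = List[Tuple[str, str]]


def to_ner_tsv(sentences: Iterable[Sentence]) -> str:
    """Render labeled sentences as tab-separated token/label lines."""
    blocks = []
    for sentence in sentences:
        lines = [f"{token}\t{label}" for token, label in sentence]
        blocks.append("\n".join(lines))
    # A blank line separates consecutive sentences.
    return "\n\n".join(blocks) + "\n"


# Hypothetical hand-labeled examples; COMPANY is an illustrative custom label.
training_sentences = [
    [("Apple", "COMPANY"), ("hired", "O"), ("Alice", "PERSON"), (".", "O")],
    [("Square", "COMPANY"), ("is", "O"), ("in", "O"),
     ("Oakland", "LOCATION"), (".", "O")],
]

tsv_text = to_ner_tsv(training_sentences)
print(tsv_text)
```

You would then save `tsv_text` to a file and point the trainer's properties file at it; in the Stanford NER examples I have seen, that is done with entries such as `trainFile` and `map = word=0,answer=1`, but treat those names as something to verify against the tutorial rather than as given here.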
