How to handle two entity extraction methods in NLP

Views: 538 | Published: 2020/5/18 0:35:06 | Tags: nlp, entity, rasa-nlu


Problem description

I am using two different entity extraction methods (https://rasa.com/docs/nlu/entities/) while building an NLP model for a chatbot in the RASA framework. The bot should handle questions that contain custom entities as well as general ones such as location or organisation, so I use both the ner_spacy and ner_crf components to create the model. I then wrote a small helper script in Python to evaluate the model's performance, and noticed that the model struggles to choose the correct entity.
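A minimal sketch of what such an evaluation helper could look like. It assumes each entity is a dict with `entity` and `value` keys, as in a typical Rasa NLU parse result; the function name `score_entities` and the sample data are hypothetical and only illustrate the comparison.

```python
from collections import Counter


def score_entities(expected, predicted):
    """Compare expected and predicted entities as (label, value) pairs.

    Both arguments are lists of dicts with "entity" and "value" keys
    (an assumption about the parse-result format).
    """
    exp = Counter((e["entity"], str(e["value"]).lower()) for e in expected)
    pred = Counter((p["entity"], str(p["value"]).lower()) for p in predicted)
    return {
        "hits": sum((exp & pred).values()),      # found with the right label
        "missed": sum((exp - pred).values()),    # expected but not predicted
        "spurious": sum((pred - exp).values()),  # predicted but not expected
    }


# Hypothetical case: the word is found, but with the wrong label.
expected = [{"entity": "animal", "value": "Dog"}]
predicted = [{"entity": "ORG", "value": "Dog", "extractor": "ner_spacy"}]
result = score_entities(expected, predicted)
print(result)
```

A wrong label counts as both one missed and one spurious entity, which makes label confusion between the two extractors visible in the totals.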

For example, for a word 'X' it chose the pre-defined entity 'ORG' from SpaCy, but it should have been recognised as a custom entity that I defined in the training data.
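One possible workaround, sketched here under assumptions, is to post-process the parse result so that a span found by ner_crf wins over an overlapping span from ner_spacy. This assumes each entity dict carries `start`, `end`, and `extractor` fields; `prefer_extractor` and the label `custom_entity` are hypothetical names, and this is not a description of how RASA itself resolves conflicts.

```python
def prefer_extractor(entities, preferred="ner_crf"):
    """Drop entities that overlap a span found by the preferred extractor."""
    preferred_spans = [
        (e["start"], e["end"])
        for e in entities
        if e.get("extractor") == preferred
    ]

    def overlaps_preferred(entity):
        # Two half-open character spans overlap if each starts before the other ends.
        return any(
            entity["start"] < end and start < entity["end"]
            for start, end in preferred_spans
        )

    return [
        e for e in entities
        if e.get("extractor") == preferred or not overlaps_preferred(e)
    ]


# Hypothetical parse result for the text "X is in Berlin".
entities = [
    {"start": 0, "end": 1, "value": "X", "entity": "ORG", "extractor": "ner_spacy"},
    {"start": 0, "end": 1, "value": "X", "entity": "custom_entity", "extractor": "ner_crf"},
    {"start": 8, "end": 14, "value": "Berlin", "entity": "GPE", "extractor": "ner_spacy"},
]
resolved = prefer_extractor(entities)
print([(e["entity"], e["value"]) for e in resolved])
```

The SpaCy entity for 'Berlin' is kept because no ner_crf span overlaps it, while the 'ORG' reading of 'X' is dropped in favour of the custom one.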

If I use only the ner_crf extractor, I face huge problems identifying location entities such as capitals. Another of my biggest problems is single-answer entities.

Q: "What's your favourite animal?"

A: Dog

My model is not able to extract the single entity 'animal' from this single-word answer. If I answer the question with two words, like 'The Dog', the model has no problem extracting the animal entity with the value 'Dog'.

So my question is: is it sensible to use two different components to extract entities, one for custom entities and the other for pre-defined entities? If I use two methods, what mechanism in the model decides which extractor is used?

By the way, I'm currently just testing things out, so my training set is not as large as it should be (fewer than 100 examples). Could the problem be solved by having many more training examples?
