How to use a Hindi model in Rasa NLU?


Question

I have built a model for the Hindi language using FastText with a spaCy backend. I followed this tutorial to build the model with FastText.

I have also linked my model with spaCy using the following command:

python -m spacy link nl_model hi

The model was linked successfully, as you can check in the image below.

Now I can't find any help on using the Hindi language: what kind of config files do I need, where do I import the Hindi model, and how do I proceed from here? I also have questions about what the data.json file should look like for Hindi and how to use entities and intents. Should the names of the entities and intents be in Hindi or in English? Can someone help me proceed? I am stuck here. I have to build a chatbot in Hindi using the Rasa Stack only.

Thanks in advance.

Recommended answer

It seems that you have successfully trained the hi model using spaCy. The next step is to write a config file like this:

language: "hi"

pipeline:
- name: "tokenizer_whitespace"
- name: "ner_crf"
- name: "ner_synonyms"
- name: "intent_featurizer_count_vectors"
- name: "intent_classifier_tensorflow_embedding"

If the hi model you just trained also has a tokenizer, you can replace tokenizer_whitespace with tokenizer_spacy.
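To give a feel for what a whitespace tokenizer does with Hindi text, here is a rough illustration in plain Python. It is only a sketch of the idea (split on whitespace and remember where each token starts), not Rasa's actual implementation, and the sample sentence and the function name whitespace_tokenize are made up for this example.

```python
def whitespace_tokenize(text):
    """Split text on whitespace and record each token's character offset."""
    tokens = []
    cursor = 0
    for word in text.split():
        # Search from the end of the previous token so repeated words
        # get their own, correct offsets.
        start = text.index(word, cursor)
        tokens.append((word, start))
        cursor = start + len(word)
    return tokens


# Hypothetical Hindi query ("tell me the weather in Delhi").
sentence = "मुझे दिल्ली का मौसम बताओ"
tokens = whitespace_tokenize(sentence)

for word, start in tokens:
    print(start, word)
```

Since Hindi separates words with spaces, this kind of splitting is usually enough to get started; a spaCy tokenizer mainly helps with punctuation and other edge cases.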

I should mention that Rasa's new intent classifier, which is based on TensorFlow, does not need the word vectors of your hi model; it learns its word vectors from scratch, see here. For entity extraction you don't need the hi model either; the tokenizer alone does the job. So, overall, you can build your bot even without the hi model!
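The reason no pretrained vectors are needed can be illustrated with a tiny bag-of-words featurizer: the vocabulary comes only from your own training sentences. This is a simplified sketch of the count-vector idea in plain Python, not the code Rasa runs internally; the sentences and the helper featurize are hypothetical.

```python
from collections import Counter

# Hypothetical Hindi training sentences.
training_sentences = [
    "मुझे दिल्ली का मौसम बताओ",
    "आज मौसम कैसा है",
    "नमस्ते",
]

# The vocabulary is built only from the training text, so no
# pretrained Hindi word vectors are involved.
vocabulary = sorted(
    {word for sentence in training_sentences for word in sentence.split()}
)


def featurize(sentence):
    """Return one count per vocabulary word; unknown words are ignored."""
    counts = Counter(sentence.split())
    return [counts[word] for word in vocabulary]


vector = featurize("आज दिल्ली का मौसम")
print(len(vocabulary), vector)
```

A classifier trained on vectors like these learns its own representation of each word from the examples you provide, which is why the amount and variety of training data matters more than the language model here.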

The training data file can be JSON or Markdown, as fully explained in the doc. I think the names of your intents and entities should be in English, but the sample queries can clearly be in any UTF-8 language, such as Hindi.
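As a minimal sketch of what a Hindi data.json could look like, the snippet below builds the JSON structure used by the rasa_nlu training data format (rasa_nlu_data with a list of common_examples). The intent names greet and ask_weather, the entity name city, the sample sentences, and the helper make_example are all assumptions made up for illustration.

```python
import json


def make_example(text, intent, entities=None):
    """Build one training example; entity offsets are computed from the text."""
    entity_list = []
    for value, entity_name in (entities or []):
        start = text.index(value)  # character offset of the entity value
        entity_list.append({
            "start": start,
            "end": start + len(value),
            "value": value,
            "entity": entity_name,
        })
    return {"text": text, "intent": intent, "entities": entity_list}


# English intent/entity names, Hindi example queries.
examples = [
    make_example("नमस्ते", "greet"),
    make_example("मुझे दिल्ली का मौसम बताओ", "ask_weather", [("दिल्ली", "city")]),
]

training_data = {"rasa_nlu_data": {"common_examples": examples}}

# ensure_ascii=False keeps the Devanagari text readable instead of \u escapes.
payload = json.dumps(training_data, ensure_ascii=False, indent=2)
print(payload)
```

When saving this to disk, open the file with encoding="utf-8" so the Hindi text is written correctly. Computing start and end from the text avoids counting Devanagari characters by hand.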

Then you can train your model using the different methods explained in the doc. For example:

python3 -m rasa_nlu.train \
    --config YOUR_CONFIG_FILE.yml \
    --data YOUR_TRAIN_DATA.json \
    --path PATH_TO_SAVE_MODEL
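If you prefer to launch the same training command from a Python script, a small wrapper along these lines could work. It only assembles the command shown above; the function name build_train_command and the file names config.yml, data.json, and models are placeholders, and actually running it assumes rasa_nlu is installed in the current environment.

```python
import sys


def build_train_command(config_path, data_path, model_dir):
    """Assemble the rasa_nlu.train command line as an argument list."""
    return [
        sys.executable, "-m", "rasa_nlu.train",
        "--config", config_path,
        "--data", data_path,
        "--path", model_dir,
    ]


# Placeholder paths; replace with your own files.
command = build_train_command("config.yml", "data.json", "models")
print(" ".join(command))

# To actually start training (requires rasa_nlu to be installed):
# import subprocess
# subprocess.run(command, check=True)
```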

You can find a good quick start in the doc.
