Which deep learning algorithm does spaCy use when we train a custom model?


Question

When we train a custom model, I see that there are dropout and n_iter parameters to tune, but which deep learning algorithm does spaCy use to train custom models? Also, when adding a new entity type, is it better to start from a blank model or to train on top of an existing model?

Recommended answer

Which learning algorithm does spaCy use?

spaCy has its own deep learning library, thinc, which it uses under the hood for its different NLP models. For most (if not all) tasks, spaCy uses a deep neural network based on a CNN, with a few tweaks. Specifically for named entity recognition, spaCy uses:

  1. A transition-based approach borrowed from shift-reduce parsers, described in the paper Neural Architectures for Named Entity Recognition by Lample et al. Matthew Honnibal describes how spaCy uses it in a YouTube video.

  2. A framework called "Embed. Encode. Attend. Predict" (starting here in the video, slides here):

     • Embed: Words are embedded using a Bloom filter, which means that word hashes, rather than the words themselves, are kept as keys in the embedding dictionary. This keeps the embedding dictionary more compact, at the cost that some words may collide and end up with the same vector representation.

     • Encode: The list of words is encoded into a sentence matrix to take context into account. spaCy uses a CNN for encoding.

     • Attend: Decide which parts are more informative given a query, and get problem-specific representations.

     • Predict: spaCy uses a multi-layer perceptron for inference.
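The Embed step can be illustrated with a minimal sketch of a hashed embedding table. This is not spaCy's or thinc's implementation; the class name `HashEmbedding`, the table size, and the use of MD5 as the hash are assumptions made purely for illustration. It shows the two properties described above: the table has a fixed size regardless of vocabulary, and distinct words can collide on the same row.

```python
import hashlib
import random


class HashEmbedding:
    """Toy embedding table addressed by a hash of the word (hypothetical helper)."""

    def __init__(self, n_rows=1000, width=8, seed=0):
        rng = random.Random(seed)
        self.n_rows = n_rows
        # Fixed-size table: it does not grow as new words are seen.
        self.table = [
            [rng.uniform(-0.1, 0.1) for _ in range(width)] for _ in range(n_rows)
        ]

    def row_for(self, word):
        # The key is the hash of the word, not the word itself.
        digest = hashlib.md5(word.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "little") % self.n_rows

    def __call__(self, word):
        return self.table[self.row_for(word)]


emb = HashEmbedding(n_rows=1000, width=8)
vector = emb("Cupertino")  # works for any string, including unseen words

# With a deliberately tiny table, every word collides on the same row.
tiny = HashEmbedding(n_rows=1, width=8)
collide = tiny("apple") == tiny("banana")
```

A smaller `n_rows` saves memory but makes collisions more frequent, which is the trade-off the Embed step accepts.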

Advantages of this framework, per Honnibal, are:

  1. Mostly equivalent to sequence tagging (another task spaCy offers models for)
  2. Shares code with the parser
  3. Easily excludes invalid sequences
  4. Arbitrary features are easy to define
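To make the point about excluding invalid sequences concrete, here is a minimal sketch of a transition-style decoder. It assumes a simplified BILUO-like action set (Begin, In, Last, Unit, Out) and a greedy search; the function names and the score dictionaries are hypothetical, and this is not spaCy's actual code. Because the set of allowed actions depends on the current state, an invalid sequence such as `B-ORG` followed by `O` can never be produced, whatever the scores say.

```python
def valid_actions(open_label, labels, is_last_token):
    """Return the actions allowed in the current state."""
    if open_label is not None:
        # Inside an entity: it can only continue or close with the same label,
        # and it must close on the final token.
        if is_last_token:
            return [f"L-{open_label}"]
        return [f"I-{open_label}", f"L-{open_label}"]
    actions = ["O"] + [f"U-{label}" for label in labels]
    if not is_last_token:
        # A multi-token entity cannot begin on the final token.
        actions += [f"B-{label}" for label in labels]
    return actions


def greedy_decode(token_scores, labels):
    """token_scores: one dict per token, mapping action name -> score."""
    open_label = None
    output = []
    for i, scores in enumerate(token_scores):
        allowed = valid_actions(open_label, labels, i == len(token_scores) - 1)
        # Choose the best-scoring action among the valid ones only.
        action = max(allowed, key=lambda a: scores.get(a, float("-inf")))
        output.append(action)
        if action.startswith("B-"):
            open_label = action[2:]
        elif action.startswith("L-"):
            open_label = None
    return output


# "O" has the highest raw score on tokens 2 and 3, but it is not a valid
# action while an entity is open, so it is never selected there.
scores = [
    {"B-ORG": 2.0, "O": 1.0},
    {"O": 5.0, "I-ORG": 1.0, "L-ORG": 0.5},
    {"O": 3.0, "I-ORG": 2.0, "L-ORG": 0.1},
]
result = greedy_decode(scores, ["ORG"])
```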

For a full overview, Matthew Honnibal describes the model in this YouTube video. The slides can be found here.

Note: This information is based on slides from 2017. The engine might have changed since then.

Theoretically, when fine-tuning a spaCy model with new entities, you have to make sure the model doesn't forget the representations of previously learned entities. The best option, if possible, is to train a model from scratch, but that might not be easy or even possible due to a lack of data or resources.
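If training from scratch is not feasible, one common way to reduce forgetting is to mix examples that still carry the old entity labels into the fine-tuning data. The sketch below shows only that data-mixing step in plain Python. The helper name `mix_with_revision`, the `revision_per_new` ratio, and the `(text, {"entities": [...]})` tuple layout are assumptions for illustration, not something the answer prescribes.

```python
import random


def mix_with_revision(new_examples, revision_examples, revision_per_new=1.0, seed=0):
    """Combine new-entity examples with examples annotated with the old labels.

    revision_per_new is a hypothetical knob: how many old-label examples to
    add per new example (capped by how many are available).
    """
    rng = random.Random(seed)
    n_revision = min(
        len(revision_examples), int(len(new_examples) * revision_per_new)
    )
    mixed = list(new_examples) + rng.sample(list(revision_examples), n_revision)
    rng.shuffle(mixed)  # avoid training on all new examples first
    return mixed


# Examples with the new entity type.
new_data = [("Horses are tall", {"entities": [(0, 6, "ANIMAL")]})]
# Examples with labels the existing model already knows.
old_data = [
    ("Apple is based in Cupertino", {"entities": [(0, 5, "ORG"), (18, 27, "GPE")]}),
]
train_data = mix_with_revision(new_data, old_data)
```

The old-label examples could come from existing gold annotations or from running the current model over raw text. Either way, the ratio is something to tune rather than a fixed rule.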

EDIT Feb 2021: spaCy version 3 now supports Transformer architectures for its deep learning models, alongside the CNN-based pipelines.
