What is used to train a self-attention mechanism?

Question

I've been trying to understand self-attention, but nothing I have found explains the concept well at a high level.

Let's say we use self-attention in an NLP task, so our input is a sentence.

Then self-attention can be used to measure how "important" each word in the sentence is for every other word.

The problem is that I do not understand how that "importance" is measured. Important for what?

What exactly is the target vector that the weights in the self-attention algorithm are trained against?

Recommended answer

Connecting language with its underlying meaning is called grounding. A sentence like "The ball is on the table" corresponds to an image, and that correspondence can be reproduced with multimodal learning. Multimodal means that different kinds of words are available, for example events, action words, subjects and so on. A self-attention mechanism maps input vectors to output vectors, with a neural network between them. The output vector of the neural network refers to the grounded situation.
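To make the "input vectors to output vectors" mapping concrete, here is a minimal sketch in plain Python. The answer does not say which form of self-attention it has in mind, so this assumes the widely used scaled dot-product variant. The names `self_attention`, `w_q`, `w_k` and `w_v` are hypothetical, and the identity matrices and two-dimensional word vectors are toy values chosen only for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def self_attention(inputs, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a list of word vectors.

    w_q, w_k, w_v are the projection matrices; in a real model they are
    learned, here they are passed in as fixed (hypothetical) values.
    """
    queries = [matvec(w_q, x) for x in inputs]
    keys = [matvec(w_k, x) for x in inputs]
    values = [matvec(w_v, x) for x in inputs]
    d_k = len(keys[0])

    outputs, all_weights = [], []
    for q in queries:
        # One score per word: how strongly this word's query matches each key.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # The output is the weighted average of all value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy "sentence" of three 2-dimensional word vectors (made-up values).
identity = [[1.0, 0.0], [0.0, 1.0]]
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(sentence, identity, identity, identity)
```

Each row of `weights` holds the "importance" scores one word assigns to every word in the sentence. In common implementations such as the Transformer, these scores have no target vector of their own: the projection matrices are adjusted by backpropagation from the loss of whatever task the whole network is trained on.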

Let us take a short example. We need a 300x200 pixel image, a sentence in natural language, and a parser. The parser works in both directions. It can convert text to an image, meaning the sentence "The ball is on the table" is converted into the 300x200 image. It can also parse a given image and extract the natural-language sentence from it. Self-attention learning is a bootstrapping technique for learning and using the grounded relationship. That means verifying existing language models, learning new ones, and predicting future system states.
