What does the embedding layer for a network look like?


Problem Description

I'm just starting with text classification, and I got stuck at the embedding layer. If I have a batch of sequences encoded as integers corresponding to each word, what does the embedding layer look like? Are there neurons like in a normal neural layer?

I've seen keras.layers.Embedding, but after reading the documentation I'm really confused about how it works. I can understand input_dim, but why is output_dim a 2D matrix? How many weights are there in this embedding layer?

I'm sorry if my question isn't explained clearly. I have no experience with NLP, so if this question about word embeddings is basic NLP knowledge, please tell me and I will look it up.

Solution

An embedding layer is just a trainable look-up table: it takes an integer index as input and returns the word embedding associated with that index:

index |                            word embeddings
=============================================================================
  0   |  word embedding for the word with index 0 (usually used for padding)
-----------------------------------------------------------------------------
  1   |  word embedding for the word with index 1
-----------------------------------------------------------------------------
  2   |  word embedding for the word with index 2
-----------------------------------------------------------------------------
  .   |
  .   |
  .   |
-----------------------------------------------------------------------------
  N   |  word embedding for the word with index N
-----------------------------------------------------------------------------
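To make this concrete, here is a minimal sketch (assuming TensorFlow/Keras; the vocabulary and embedding sizes are made up for illustration) showing that looking up an index simply returns the corresponding row of the layer's weight matrix:

import numpy as np
import tensorflow as tf

vocab_size = 10      # input_dim: number of distinct indices (words)
embedding_dim = 4    # output_dim: size of each word embedding

layer = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_dim)

# Looking up index 2 returns row 2 of the (vocab_size, embedding_dim) weight matrix.
indices = np.array([[2]])
output = layer(indices)               # shape: (1, 1, 4)
weights = layer.get_weights()[0]      # shape: (10, 4)
print(np.allclose(output.numpy()[0, 0], weights[2]))  # True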

It is trainable in the sense that the embedding values are not necessarily fixed and can be updated during training. The input_dim argument is actually the number of words (or, more generally, the number of distinct elements in the sequences). The output_dim argument specifies the dimension of each word embedding; for example, with output_dim=100, each word embedding would be a vector of size 100. Further, since the input of an embedding layer is a sequence of integers (corresponding to the words in a sentence), its output has a shape of (num_sequences, len_sequence, output_dim), i.e. for each integer in a sequence, an embedding vector of size output_dim is returned.
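As a quick check of these shapes, the following sketch (with hypothetical sizes) passes a batch of integer sequences through an embedding layer and prints the resulting shape:

import numpy as np
import tensorflow as tf

num_sequences, len_sequence = 32, 10   # a batch of 32 sentences, 10 words each
layer = tf.keras.layers.Embedding(input_dim=1000, output_dim=100)

batch = np.random.randint(0, 1000, size=(num_sequences, len_sequence))
embedded = layer(batch)
print(embedded.shape)                  # (32, 10, 100), i.e. (num_sequences, len_sequence, output_dim)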

As for the number of weights in an embedding layer, it is very easy to calculate: there are input_dim unique indices, and each index is associated with a word embedding of size output_dim. Therefore, the number of weights in an embedding layer is input_dim x output_dim.
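This is easy to verify in code; in this sketch (sizes again hypothetical), the layer is built and its parameter count matches input_dim x output_dim:

import tensorflow as tf

layer = tf.keras.layers.Embedding(input_dim=1000, output_dim=100)
layer.build(input_shape=(None,))       # force creation of the weight matrix
print(layer.count_params())            # 100000 == 1000 * 100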

