Explain with example: how embedding layers in Keras work


Question

I don't understand the Embedding layer of Keras. Although there are lots of articles explaining it, I am still confused. For example, the code below is from the IMDB sentiment analysis example:

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

top_words = 5000
max_review_length = 500
embedding_vecor_length = 32

model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
# X_train, y_train: padded integer sequences and labels from the IMDB dataset
model.fit(X_train, y_train, nb_epoch=3, batch_size=64)

In this code, what exactly is the embedding layer doing? What would be the output of the embedding layer? It would be nice if someone could explain it with some examples!

Answer

The Embedding layer creates embedding vectors out of the input words (I myself still don't understand the math), similarly to what word2vec or precomputed GloVe would do.

Before I get to your code, let's make a short example.

texts = ['This is a text','This is not a text']

First we turn these sentences into vectors of integers, where each word is replaced by the number assigned to it in the dictionary, and the order within the vector preserves the sequence of the words.

from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

max_review_length = 6  # maximum length of the sentence
embedding_vecor_length = 3
top_words = 10

# num_words is the maximum number of words to keep; if there are more unique
# words, only the most frequent ones are kept
tokenizer = Tokenizer(num_words=top_words)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
word_index = tokenizer.word_index
input_dim = len(word_index) + 1
print('Found %s unique tokens.' % len(word_index))

# max_review_length is the maximum length of the input text, so that we can
# create vectors like [... 0, 0, 1, 3, 50] where 1, 3, 50 are individual words
data = pad_sequences(sequences, maxlen=max_review_length)

print('Shape of data tensor:', data.shape)
print(data)

[Out:] 
'This is a text' --> [0 0 1 2 3 4]
'This is not a text' --> [0 1 2 5 3 4]

Now you can feed these into the embedding layer:

from keras.models import Sequential
from keras.layers import Embedding

model = Sequential()
model.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length,mask_zero=True))
model.compile(optimizer='adam', loss='categorical_crossentropy')
output_array = model.predict(data)

output_array contains an array of size (2, 6, 3): 2 input reviews or sentences in my case, 6 is the maximum number of words in each review (max_review_length), and 3 is embedding_vecor_length. For example (a quick sanity check follows the array below):

array([[[-0.01494285, -0.007915  ,  0.01764857],
        [-0.01494285, -0.007915  ,  0.01764857],
        [-0.03019481, -0.02910612,  0.03518577],
        [-0.0046863 ,  0.04763055, -0.02629668],
        [ 0.02297204,  0.02146662,  0.03114786],
        [ 0.01634104,  0.02296363, -0.02348827]],

       [[-0.01494285, -0.007915  ,  0.01764857],
        [-0.03019481, -0.02910612,  0.03518577],
        [-0.0046863 ,  0.04763055, -0.02629668],
        [-0.01736645, -0.03719328,  0.02757809],
        [ 0.02297204,  0.02146662,  0.03114786],
        [ 0.01634104,  0.02296363, -0.02348827]]], dtype=float32)
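A quick way to convince yourself that the layer is essentially a (trainable) lookup table, assuming the texts, data and model variables defined above: repeated word indices map to identical vectors, regardless of where they occur.

import numpy as np

# (number of sentences, max_review_length, embedding_vecor_length)
print(output_array.shape)  # (2, 6, 3)

# 'this' has index 1 in word_index; it sits at position 2 of the first
# padded sequence and at position 1 of the second, so both outputs must
# contain the same 3-dimensional vector for it.
assert np.allclose(output_array[0, 2], output_array[1, 1])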

In your case you have a vocabulary of 5000 words, reviews of at most 500 words (anything longer is trimmed), and each of these 500 words is turned into a vector of size 32.
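If it helps, here is a rough sketch of what the first layer of your model alone produces. The numbers come from your code; the your_* names and fake_reviews are made up here, the latter standing in for a batch of padded reviews:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding

# Values taken from your model
your_top_words = 5000
your_max_review_length = 500
your_embedding_vecor_length = 32

# Only the embedding layer, so its output can be inspected directly
embedding_only = Sequential()
embedding_only.add(Embedding(your_top_words, your_embedding_vecor_length,
                             input_length=your_max_review_length))

# A fake batch of 64 "reviews", each a padded sequence of 500 word indices
fake_reviews = np.random.randint(0, your_top_words, size=(64, your_max_review_length))
print(embedding_only.predict(fake_reviews).shape)  # (64, 500, 32)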

You can get the mapping between the word indexes and the embedding vectors by running:

model.layers[0].get_weights()

In the case below, top_words was 10, so we have a mapping for 10 words, and you can see that the rows for indices 0, 1, 2, 3, 4 and 5 are equal to the vectors in output_array above (a lookup example follows the weights):

[array([[-0.01494285, -0.007915  ,  0.01764857],
        [-0.03019481, -0.02910612,  0.03518577],
        [-0.0046863 ,  0.04763055, -0.02629668],
        [ 0.02297204,  0.02146662,  0.03114786],
        [ 0.01634104,  0.02296363, -0.02348827],
        [-0.01736645, -0.03719328,  0.02757809],
        [ 0.0100757 , -0.03956784,  0.03794377],
        [-0.02672029, -0.00879055, -0.039394  ],
        [-0.00949502, -0.02805768, -0.04179233],
        [ 0.0180716 ,  0.03622523,  0.02232374]], dtype=float32)]
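To look up one specific word, you can combine the tokenizer's word_index with this weight matrix. A small sketch, assuming the tokenizer and model from the snippets above:

# shape (top_words, embedding_vecor_length)
embedding_matrix = model.layers[0].get_weights()[0]

# Row i of the matrix is the embedding vector for word index i
# (row 0 corresponds to the padding index here).
for word, index in word_index.items():
    print(word, index, embedding_matrix[index])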

As mentioned in https://stats.stackexchange.com/questions/270546/how-does-keras-embedding-layer-work, these vectors are initialized randomly and optimized by the network optimizer just like any other parameter of the network.
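You can check this directly on the toy example: snapshot the embedding weights, train for a few steps, and compare. This is a minimal sketch that builds a fresh little model on top of the same variables (top_words, embedding_vecor_length, max_review_length, data); the labels [0, 1] are dummies whose only purpose is to let fit() run:

import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense

toy = Sequential()
toy.add(Embedding(top_words, embedding_vecor_length, input_length=max_review_length))
toy.add(LSTM(4))
toy.add(Dense(1, activation='sigmoid'))
toy.compile(optimizer='adam', loss='binary_crossentropy')

before = toy.layers[0].get_weights()[0].copy()
toy.fit(data, np.array([0, 1]), epochs=20, verbose=0)
after = toy.layers[0].get_weights()[0]

# A positive value: the embedding rows moved away from their random
# starting point, i.e. they are trained like any other weight.
print(np.abs(after - before).max())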

