Keras initialize large embeddings layer with pretrained embeddings


Problem Description

I am trying to re-train a word2vec model in Keras 2 with the Tensorflow backend by using pretrained embeddings and a custom corpus.

This is how I initialize the embeddings layer with pretrained embeddings:

embedding = Embedding(vocab_size, embedding_dim,
                      input_length=1, name='embedding',
                      embeddings_initializer=lambda x: pretrained_embeddings)

where pretrained_embeddings is a matrix of size vocab_size x embedding_dim.

This works as long as pretrained_embeddings is not too big.

Unfortunately this is not the case for me: vocab_size=2270872 and embedding_dim=300.

Upon initializing the Embeddings layer I get the error:

Cannot create a tensor proto whose content is larger than 2GB.

The error comes from the function add_weight() in /opt/r/anaconda3/lib/python3.6/site-packages/keras/engine/base_layer.py, more specifically the following line:

weight = K.variable(initializer(shape),
                    dtype=dtype,
                    name=name,
                    constraint=constraint)

initializer is the lambda function from above, which returns the big matrix. shape is (2270872, 300), as already mentioned.

Is it possible to solve this issue without having to go to low-level Tensorflow programming? If I switch to Theano as a backend the code runs fine, but I'd like to use Tensorflow for its better long-term prospects.

The only similar Stackoverflow question I found was this, which proposes placeholder variables, but I am not sure how I can apply them on the level of Keras.

Thank you very much.

EDIT: I am more than willing to work around this issue on the level of the Tensorflow backend. It's just that I don't know how to combine Tensorflow and Keras code in the same application in this case. Most examples are either one or the other, not both.

For example, what use are the Tensorflow placeholder variables when the initialization of the Embeddings layer in Keras will inevitably invoke the add_weight() function, which causes the issue?

SOLUTION:

As hinted at in @blue-phoenox's comment, I rewrote the code like this:

embedding = Embedding(vocab_size, embedding_dim,
                      input_length=1, 
                      name='embedding')
embedding.build(input_shape=(1,)) # the input_shape here has no effect in the build function
embedding.set_weights([pretrained_embeddings])

That did it. Thanks again @blue-phoenox.
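
For context, here is a minimal sketch of how this workaround might be wired into a small model. The reduced sizes, the random stand-in for pretrained_embeddings, and the surrounding model architecture are illustrative assumptions, not part of the original question:

import numpy as np
from keras.layers import Input, Embedding, Flatten, Dense
from keras.models import Model

# Illustrative sizes; the question uses vocab_size=2270872, embedding_dim=300.
vocab_size, embedding_dim = 10000, 300
pretrained_embeddings = np.random.rand(vocab_size, embedding_dim)

# Build the layer first, then copy the pretrained matrix into the existing
# variable with set_weights; the array is fed in at assignment time instead
# of being baked into the graph as a >2GB constant.
embedding = Embedding(vocab_size, embedding_dim,
                      input_length=1, name='embedding')
embedding.build(input_shape=(1,))
embedding.set_weights([pretrained_embeddings])

inp = Input(shape=(1,), dtype='int32')
out = Dense(vocab_size, activation='softmax')(Flatten()(embedding(inp)))
model = Model(inp, out)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')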

Recommended Answer

Instead of using the embeddings_initializer argument of the Embedding layer you can load pre-trained weights for your embedding layer using the weights argument. This way you should be able to hand over pre-trained embeddings larger than 2GB.

Here is a short example:

from keras.layers import Embedding

embedding_layer = Embedding(vocab_size,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)

Where embedding_matrix is just a regular numpy matrix containing your weights.
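
For reference, here is a hedged sketch of one common way such an embedding_matrix is assembled from pretrained vectors. The word_index and embeddings_index dictionaries below are toy stand-ins (in practice they would come from a tokenizer and from a pretrained GloVe/word2vec file):

import numpy as np

# Toy stand-ins for illustration.
word_index = {'king': 1, 'queen': 2}              # token -> integer id
embeddings_index = {'king': np.random.rand(300),  # token -> pretrained vector
                    'queen': np.random.rand(300)}

vocab_size, EMBEDDING_DIM = len(word_index) + 1, 300
embedding_matrix = np.zeros((vocab_size, EMBEDDING_DIM))
for word, i in word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector  # rows for words without a vector stay zero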

For further examples you can also take a look here:
https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html

As @PavlinMavrodiev (see the end of the question) correctly pointed out, the weights argument is deprecated. He used the layer method set_weights to set the weights instead:

layer.set_weights(weights): sets the weights of the layer from a list of Numpy arrays (with the same shapes as the output of get_weights).

To get the trained weights, get_weights can be used:

layer.get_weights(): returns the weights of the layer as a list of Numpy arrays.

Both of these are methods from the Keras Layer base class and are available for all Keras layers, including the embedding layer.
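
As a small standalone sketch of these two methods (the sizes here are arbitrary, chosen only for illustration):

from keras.layers import Embedding
import numpy as np

layer = Embedding(1000, 50)
layer.build(input_shape=(None,))               # create the weight variable
layer.set_weights([np.random.rand(1000, 50)])  # load (e.g. pretrained) values
weights = layer.get_weights()                  # -> [array of shape (1000, 50)]
assert weights[0].shape == (1000, 50)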
