How to use a pre-trained embedding matrix in Tensorflow 2.0 RNN as initial weights in an embedding layer?


Question

I'd like to use a pretrained GloVe embedding as the initial weights for an embedding layer in an RNN encoder/decoder. The code is in Tensorflow 2.0. Simply adding the embedding matrix as a weights = [embedding_matrix] parameter to the tf.keras.layers.Embedding layer won't do it, because the encoder is an object and I'm not sure how to effectively pass the embedding_matrix to this object at training time.

My code closely follows the neural machine translation example in the Tensorflow 2.0 documentation. How would I add a pre-trained embedding matrix to the encoder in this example? The encoder is an object. When I get to training, the GloVe embedding matrix is unavailable to the Tensorflow graph. I get the error message:

RuntimeError: Cannot get value inside Tensorflow graph function.

The code uses the GradientTape method and teacher forcing in the training process (a sketch of the train_step this refers to is shown after the training loop below).

I've tried modifying the encoder object to include the embedding_matrix at various points, including in the encoder's __init__, call, and initialize_hidden_state methods. All of these fail. The other questions on Stack Overflow and elsewhere are for Keras or older versions of Tensorflow, not Tensorflow 2.0.

import time

import pandas as pd
import tensorflow as tf


class Encoder(tf.keras.Model):
    def __init__(self, vocab_size, embedding_dim, enc_units, batch_sz):
        super(Encoder, self).__init__()
        self.batch_sz = batch_sz
        self.enc_units = enc_units
        # Attempted fix: pass the GloVe matrix via weights= (this is what fails).
        self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                                   weights=[embedding_matrix])
        self.gru = tf.keras.layers.GRU(self.enc_units,
                                       return_sequences=True,
                                       return_state=True,
                                       recurrent_initializer='glorot_uniform')

    def call(self, x, hidden):
        x = self.embedding(x)
        output, state = self.gru(x, initial_state=hidden)
        return output, state

    def initialize_hidden_state(self):
        return tf.zeros((self.batch_sz, self.enc_units))

encoder = Encoder(vocab_inp_size, embedding_dim, units, BATCH_SIZE)

# sample input
sample_hidden = encoder.initialize_hidden_state()
sample_output, sample_hidden = encoder(example_input_batch, sample_hidden)
print('Encoder output shape: (batch size, sequence length, units) {}'.format(sample_output.shape))
print('Encoder Hidden state shape: (batch size, units) {}'.format(sample_hidden.shape))

# ... Bahdanau Attention, Decoder layers, and train_step defined, see link to full tensorflow code above ...

# Relevant training code

EPOCHS = 10

training_record = pd.DataFrame(columns = ['epoch', 'training_loss', 'validation_loss', 'epoch_time'])


for epoch in range(EPOCHS):
    template = 'Epoch {}/{}'
    print(template.format(epoch + 1, EPOCHS))
    start = time.time()

    enc_hidden = encoder.initialize_hidden_state()
    total_loss = 0
    total_val_loss = 0

    for (batch, (inp, targ)) in enumerate(dataset.take(steps_per_epoch)):
        batch_loss = train_step(inp, targ, enc_hidden)
        total_loss += batch_loss

        if batch % 100 == 0:
            template = 'batch {} ============== train_loss: {}'
            print(template.format(batch + 1, round(batch_loss.numpy(), 4)))
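
For context, here is a minimal sketch of the train_step referenced above, following the Tensorflow 2.0 NMT tutorial (decoder, optimizer, loss_function, and the targ_lang tokenizer are assumed to be defined as in that tutorial):

@tf.function
def train_step(inp, targ, enc_hidden):
    loss = 0

    with tf.GradientTape() as tape:
        enc_output, enc_hidden = encoder(inp, enc_hidden)
        dec_hidden = enc_hidden
        # Seed the decoder with the <start> token for every sequence in the batch.
        dec_input = tf.expand_dims([targ_lang.word_index['<start>']] * BATCH_SIZE, 1)

        # Teacher forcing: feed the ground-truth target token as the next
        # decoder input instead of the decoder's own prediction.
        for t in range(1, targ.shape[1]):
            predictions, dec_hidden, _ = decoder(dec_input, dec_hidden, enc_output)
            loss += loss_function(targ[:, t], predictions)
            dec_input = tf.expand_dims(targ[:, t], 1)

    batch_loss = loss / int(targ.shape[1])
    variables = encoder.trainable_variables + decoder.trainable_variables
    gradients = tape.gradient(loss, variables)
    optimizer.apply_gradients(zip(gradients, variables))
    return batch_loss

Because train_step runs as a tf.function, anything inside it that tries to fetch concrete tensor values cannot execute in graph mode, which is consistent with the RuntimeError above.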

Answer

I was trying to do the same thing and getting the exact same error. The problem was that the weights argument of the Embedding layer is currently deprecated. Changing weights= to embeddings_initializer= worked for me.

self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                           embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
                                           trainable=False)
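
With trainable=False the GloVe vectors stay frozen during training; drop that argument (or set trainable=True) if you want to fine-tune them.

If you also need to build embedding_matrix in the first place, here is a minimal sketch of the usual pattern for a GloVe text file (the file name glove.6B.100d.txt and the inp_lang tokenizer are assumptions; inp_lang is the input-language tokenizer from the NMT tutorial, and embedding_dim must match the GloVe file's dimension):

import numpy as np

embeddings_index = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        # First token is the word; the rest are the vector components.
        embeddings_index[values[0]] = np.asarray(values[1:], dtype='float32')

# Row i holds the GloVe vector for the word with index i. Index 0 is
# reserved by the tokenizer, and out-of-vocabulary words stay all-zero.
embedding_matrix = np.zeros((vocab_inp_size, embedding_dim))
for word, i in inp_lang.word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector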
