Why set return_sequences=True and stateful=True for tf.keras.layers.LSTM?


Question

I am learning TensorFlow 2.0 and following the tutorial. In the RNN example, I found this code:

import tensorflow as tf

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.LSTM(rnn_units,
                         return_sequences=True,
                         stateful=True,
                         recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

My question is: why does the code set the arguments return_sequences=True and stateful=True? What happens if the default arguments are used instead?

Answer

The example in the tutorial is about text generation. For one batch, the tensors flowing through the network have these dimensions:

(64, 100, 65) # (batch_size, sequence_length, vocab_size)
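
As a sanity check, here is a minimal sketch of how that shape arises, reusing the build_model function above. The concrete values (vocab_size=65, embedding_dim=256, rnn_units=1024, batch_size=64, sequence length 100) are assumed tutorial-style values, not anything fixed by the API:

import tensorflow as tf

vocab_size, embedding_dim, rnn_units, batch_size = 65, 256, 1024, 64  # assumed values
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size)

# A batch of integer-encoded characters: (batch_size, sequence_length)
example_batch = tf.random.uniform((batch_size, 100), maxval=vocab_size, dtype=tf.int32)

# The Dense layer produces one logit vector over the vocabulary per character position
predictions = model(example_batch)
print(predictions.shape)  # (64, 100, 65) = (batch_size, sequence_length, vocab_size)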

  1. return_sequences=True

The intention is to predict the next character at every time step, i.e. for every character in the input sequence.

So, return_sequences=True is set in order to get an output shape of (64, 100, 65). If this argument were set to False (the default), only the last output would be returned, so for a batch of 64 the output would be (64, 65), i.e. for each sequence of 100 characters, only the last predicted character would be returned.
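
To make the difference concrete, here is a small standalone sketch (an isolated LSTM layer with made-up sizes, not the tutorial model):

import tensorflow as tf

x = tf.random.normal((64, 100, 256))  # (batch_size, sequence_length, features)

# return_sequences=True: one output vector per time step
print(tf.keras.layers.LSTM(1024, return_sequences=True)(x).shape)   # (64, 100, 1024)

# return_sequences=False (the default): only the last time step's output
print(tf.keras.layers.LSTM(1024, return_sequences=False)(x).shape)  # (64, 1024)

In the tutorial model, the final Dense(vocab_size) layer then maps each of those per-step vectors to vocabulary logits, which is what turns the per-step outputs into (64, 100, 65) rather than (64, 65).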

  2. stateful=True

From the documentation: "If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch."

In the diagram from the tutorial, you can see that setting stateful=True helps the LSTM make better predictions by providing the context of the previous prediction.
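
Here is a minimal sketch of what that carried-over state means in practice (an isolated stateful LSTM with made-up sizes; reset_states() is the tf.keras 2.x method for clearing the state):

import tensorflow as tf

# stateful=True needs a fixed batch size, which is why the tutorial model
# passes batch_input_shape=[batch_size, None] to its Embedding layer.
lstm = tf.keras.layers.LSTM(8, return_sequences=True, stateful=True,
                            batch_input_shape=(4, None, 16))

chunk1 = tf.random.normal((4, 10, 16))
chunk2 = tf.random.normal((4, 10, 16))

out1 = lstm(chunk1)  # starts from zero states
out2 = lstm(chunk2)  # starts from the states left behind by chunk1

# When the next batch does not continue the previous one (e.g. a new epoch),
# the carried-over states have to be cleared explicitly:
lstm.reset_states()

This is also what makes text generation work in the tutorial's style: each newly predicted character is fed back into the model one step at a time, and the state carried between calls preserves the context of everything generated so far.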
