Is RNN initial state reset for subsequent mini-batches?
Question
Could someone please clarify whether the initial state of the RNN in TF is reset for subsequent mini-batches, or whether the last state of the previous mini-batch is used, as mentioned in Ilya Sutskever et al., ICLR 2015?
Answer
The tf.nn.dynamic_rnn() and tf.nn.rnn() operations allow you to specify the initial state of the RNN via the initial_state parameter. If you don't specify this parameter, the hidden state is initialized to zero vectors at the beginning of each training batch.
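To see what that default means in practice, here is a minimal sketch in plain NumPy (not the TF API; all names and weights are made up for illustration) contrasting a hidden state that is reset to zeros for every mini-batch with a "stateful" variant that carries the last state of one batch into the next:

```python
import numpy as np

# Toy hand-rolled RNN step. hidden_size, frame_size, W_h, W_x and
# run_batch are hypothetical names for this example only.
hidden_size, frame_size = 4, 3
rng = np.random.default_rng(0)
W_h = 0.5 * rng.standard_normal((hidden_size, hidden_size))
W_x = 0.5 * rng.standard_normal((frame_size, hidden_size))

def run_batch(frames, state):
    # Unroll the cell over one mini-batch and return the final state.
    for x in frames:
        state = np.tanh(state @ W_h + x @ W_x)
    return state

batch_0 = rng.standard_normal((3, frame_size))
batch_1 = rng.standard_normal((3, frame_size))

# Reset variant: batch_1 starts from a zero state (TF's default
# when initial_state is not given).
reset_final = run_batch(batch_1, np.zeros(hidden_size))

# Stateful variant: the final state of batch_0 seeds batch_1.
carried = run_batch(batch_0, np.zeros(hidden_size))
stateful_final = run_batch(batch_1, carried)

# The two final states diverge, which is exactly the behavioral
# difference the initial_state argument controls.
```

The stateful variant is what you get in TF by persisting the final state and passing it as initial_state for the next run, as shown below.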
In TensorFlow, you can wrap tensors in tf.Variable() to keep their values in the graph between multiple session runs. Just make sure to mark them as non-trainable, because the optimizers tune all trainable variables by default.
data = tf.placeholder(tf.float32, (batch_size, max_length, frame_size))
cell = tf.nn.rnn_cell.GRUCell(256)
# Non-trainable variable that persists the RNN state across session runs.
state = tf.Variable(cell.zero_state(batch_size, tf.float32), trainable=False)
output, new_state = tf.nn.dynamic_rnn(cell, data, initial_state=state)
# Write the final state back into the variable before returning the output.
with tf.control_dependencies([state.assign(new_state)]):
    output = tf.identity(output)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
sess.run(output, {data: ...})
I haven't tested this code, but it should give you a hint in the right direction. There is also a tf.nn.state_saving_rnn() to which you can provide a state saver object, but I haven't used it yet.