Is RNN initial state reset for subsequent mini-batches?


Question

Could someone please clarify whether the initial state of the RNN in TF is reset for subsequent mini-batches, or whether the last state of the previous mini-batch is used, as in Ilya Sutskever et al., ICLR 2015?

Answer

The tf.nn.dynamic_rnn() or tf.nn.rnn() operations allow you to specify the initial state of the RNN with the initial_state parameter. If you don't specify this parameter, the hidden state is initialized to zero vectors at the beginning of each training batch.
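To make the default concrete, here is a framework-free numpy sketch of a toy recurrence (the weights, sizes, and the run_batch helper are made up for illustration): with zero initialization, feeding the same batch twice gives the same final state, whereas carrying the state over does not.

```python
import numpy as np

rng = np.random.RandomState(0)
W, U = rng.randn(4, 4) * 0.1, rng.randn(3, 4) * 0.1  # hypothetical weights

def run_batch(x, h):
    # x: (time, input_dim) inputs; h: (hidden_dim,) initial state
    for x_t in x:
        h = np.tanh(h @ W + x_t @ U)
    return h

batch = rng.randn(5, 3)

# Default behaviour: the state starts from zeros for every batch,
# so processing the same batch twice yields the same final state.
h1 = run_batch(batch, np.zeros(4))
h2 = run_batch(batch, np.zeros(4))
assert np.allclose(h1, h2)

# Stateful behaviour: the final state of one batch seeds the next,
# so a second pass over the same data ends in a different state.
h3 = run_batch(batch, h1)
assert not np.allclose(h1, h3)
```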

In TensorFlow, you can wrap tensors in tf.Variable() to keep their values in the graph between session runs. Just make sure to mark them as non-trainable, because optimizers tune all trainable variables by default.

import tensorflow as tf

data = tf.placeholder(tf.float32, (batch_size, max_length, frame_size))

cell = tf.nn.rnn_cell.GRUCell(256)
# Non-trainable variable that persists the RNN state across session runs.
state = tf.Variable(cell.zero_state(batch_size, tf.float32), trainable=False)
output, new_state = tf.nn.dynamic_rnn(cell, data, initial_state=state)

# Write the new state back to the variable before returning the output.
with tf.control_dependencies([state.assign(new_state)]):
    output = tf.identity(output)

sess = tf.Session()
sess.run(tf.initialize_all_variables())
sess.run(output, {data: ...})

I haven't tested this code, but it should point you in the right direction. There is also tf.nn.state_saving_rnn(), to which you can provide a state saver object, but I haven't used it yet.

