TensorFlow: Remember LSTM state for next batch (stateful LSTM)

Question
Given a trained LSTM model I want to perform inference for single timesteps, i.e. seq_length = 1
in the example below. After each timestep the internal LSTM (memory and hidden) states need to be remembered for the next 'batch'. For the very beginning of the inference the internal LSTM states init_c, init_h
are computed given the input. These are then stored in a LSTMStateTuple
object which is passed to the LSTM. During training this state is updated every timestep. However for inference I want the state
to be saved in between batches, i.e. the initial states only need to be computed at the very beginning and after that the LSTM states should be saved after each 'batch' (n=1).
I found this related StackOverflow question: Tensorflow, best way to save state in RNNs?. However, that approach only works with state_is_tuple=False, and this behavior will soon be deprecated by TensorFlow (see rnn_cell.py). Keras seems to have a nice wrapper that makes stateful LSTMs possible, but I don't know the best way to achieve this in TensorFlow. This issue on the TensorFlow GitHub is also related to my question: https://github.com/tensorflow/tensorflow/issues/2838
Any good suggestions for building a stateful LSTM model?
inputs = tf.placeholder(tf.float32, shape=[None, seq_length, 84, 84], name="inputs")
targets = tf.placeholder(tf.float32, shape=[None, seq_length], name="targets")
num_lstm_layers = 2

with tf.variable_scope("LSTM") as scope:
    lstm_cell = tf.nn.rnn_cell.LSTMCell(512, initializer=initializer, state_is_tuple=True)
    self.lstm = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_lstm_layers, state_is_tuple=True)

    init_c = # compute initial LSTM memory state using contents in placeholder 'inputs'
    init_h = # compute initial LSTM hidden state using contents in placeholder 'inputs'
    self.state = [tf.nn.rnn_cell.LSTMStateTuple(init_c, init_h)] * num_lstm_layers

    outputs = []
    for step in range(seq_length):
        if step != 0:
            scope.reuse_variables()

        # CNN features, as input for LSTM
        x_t = # ...

        # LSTM step through time
        output, self.state = self.lstm(x_t, self.state)
        outputs.append(output)
Answer
I found out it was easiest to save the whole state for all layers in a placeholder.
init_state = np.zeros((num_layers, 2, batch_size, state_size))
...
state_placeholder = tf.placeholder(tf.float32, [num_layers, 2, batch_size, state_size])
Then unpack it and create a tuple of LSTMStateTuples before using the native TensorFlow RNN API.
# note: tf.unpack was renamed to tf.unstack in TensorFlow >= 1.0
l = tf.unpack(state_placeholder, axis=0)
rnn_tuple_state = tuple(
    [tf.nn.rnn_cell.LSTMStateTuple(l[idx][0], l[idx][1])
     for idx in range(num_layers)]
)
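To sanity-check the layout, the same unpacking can be mirrored in plain NumPy (a sketch with made-up sizes, not TensorFlow code): index 0 on the second axis is the cell state c, index 1 the hidden state h, for each layer.

```python
import numpy as np

num_layers, batch_size, state_size = 2, 1, 4  # illustrative sizes

# Same layout as state_placeholder: (num_layers, 2, batch_size, state_size);
# axis 1 holds (c, h) for each layer.
packed = np.arange(num_layers * 2 * batch_size * state_size,
                   dtype=np.float32).reshape(num_layers, 2, batch_size, state_size)

# Mirror of tf.unpack(..., axis=0) followed by LSTMStateTuple construction
l = [packed[idx] for idx in range(num_layers)]
rnn_tuple_state = tuple((l[idx][0], l[idx][1]) for idx in range(num_layers))

assert len(rnn_tuple_state) == num_layers
assert rnn_tuple_state[0][0].shape == (batch_size, state_size)  # c of layer 0
assert rnn_tuple_state[1][1].shape == (batch_size, state_size)  # h of layer 1
```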
This tuple is then passed to the RNN API:
cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
cell = tf.nn.rnn_cell.MultiRNNCell([cell]*num_layers, state_is_tuple=True)
outputs, state = tf.nn.dynamic_rnn(cell, x_input_batch, initial_state=rnn_tuple_state)
The state returned by dynamic_rnn is then fed back into the placeholder for the next batch.
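The feed-back pattern itself, with the graph and sess.run stripped away, looks like this. In this toy NumPy sketch, mock_lstm_step stands in for a sess.run([outputs, state], feed_dict={..., state_placeholder: current_state}) call; the toy state update is made up purely to show the threading.

```python
import numpy as np

num_layers, batch_size, state_size = 2, 1, 4  # illustrative sizes

def mock_lstm_step(x, state):
    """Stand-in for one sess.run call: returns an output and the new state."""
    new_state = state + x           # toy update; a real LSTM computes c and h
    return new_state.sum(), new_state

# Zero-initialize once, then thread the state through every 'batch' (n = 1)
current_state = np.zeros((num_layers, 2, batch_size, state_size), dtype=np.float32)
outputs = []
for x_t in [1.0, 2.0, 3.0]:  # three single-timestep batches
    out, current_state = mock_lstm_step(x_t, current_state)
    outputs.append(out)

# The state accumulates across batches instead of being reset each time
assert np.allclose(current_state, 1.0 + 2.0 + 3.0)
```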