TensorFlow: Remember LSTM state for next batch (stateful LSTM)

Problem Description

Given a trained LSTM model, I want to perform inference for single timesteps, i.e. seq_length = 1 in the example below. After each timestep the internal LSTM (memory and hidden) states need to be remembered for the next 'batch'. At the very beginning of inference the internal LSTM states init_c, init_h are computed from the input. These are then stored in an LSTMStateTuple object which is passed to the LSTM. During training this state is updated at every timestep. For inference, however, I want the state to be saved between batches, i.e. the initial states only need to be computed at the very beginning, and after that the LSTM states should be saved after each 'batch' (n = 1).

I found this related Stack Overflow question: Tensorflow, best way to save state in RNNs?. However, that approach only works with state_is_tuple=False, behavior that TensorFlow will soon deprecate (see rnn_cell.py). Keras seems to have a nice wrapper that makes stateful LSTMs possible, but I don't know the best way to achieve this in TensorFlow. This issue on the TensorFlow GitHub is also related to my question: https://github.com/tensorflow/tensorflow/issues/2838

Does anyone have good suggestions for building a stateful LSTM model?

inputs  = tf.placeholder(tf.float32, shape=[None, seq_length, 84, 84], name="inputs")
targets = tf.placeholder(tf.float32, shape=[None, seq_length], name="targets")

num_lstm_layers = 2

with tf.variable_scope("LSTM") as scope:

    lstm_cell  = tf.nn.rnn_cell.LSTMCell(512, initializer=initializer, state_is_tuple=True)
    self.lstm  = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_lstm_layers, state_is_tuple=True)

    init_c = # compute initial LSTM memory state using contents in placeholder 'inputs'
    init_h = # compute initial LSTM hidden state using contents in placeholder 'inputs'
    self.state = [tf.nn.rnn_cell.LSTMStateTuple(init_c, init_h)] * num_lstm_layers

    outputs = []

    for step in range(seq_length):

        if step != 0:
            scope.reuse_variables()

        # CNN features, as input for LSTM
        x_t = # ... 

        # LSTM step through time
        output, self.state = self.lstm(x_t, self.state)
        outputs.append(output)

Recommended Answer

I found it was easiest to save the whole state for all layers in a placeholder.

# one state array per layer: [num_layers, 2 (c and h), batch_size, state_size]
init_state = np.zeros((num_layers, 2, batch_size, state_size))

...

state_placeholder = tf.placeholder(tf.float32, [num_layers, 2, batch_size, state_size])

Then unpack it and create a tuple of LSTMStateTuples before using the native TensorFlow RNN API.

l = tf.unpack(state_placeholder, axis=0)  # tf.unpack was renamed tf.unstack in newer TensorFlow versions
rnn_tuple_state = tuple(
    [tf.nn.rnn_cell.LSTMStateTuple(l[idx][0], l[idx][1])
     for idx in range(num_layers)]
)

This tuple is then passed to the RNN API:

cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
# Note: newer TensorFlow versions require a fresh cell instance per layer
# instead of reusing one cell via [cell] * num_layers
cell = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers, state_is_tuple=True)
outputs, state = tf.nn.dynamic_rnn(cell, x_input_batch, initial_state=rnn_tuple_state)

The state variable returned by each run is then fed back in for the next batch through the placeholder.
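
A minimal sketch of such an inference loop follows, assuming the graph above has been built, that x_input_batch is itself a tf.placeholder, and that data_batches is a hypothetical iterable of input arrays (these names are illustrative, not from the original answer):

import numpy as np
import tensorflow as tf

current_state = np.zeros((num_layers, 2, batch_size, state_size))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for x in data_batches:
        # 'state' comes back as a tuple of LSTMStateTuples of numpy arrays.
        batch_outputs, state_out = sess.run(
            [outputs, state],
            feed_dict={x_input_batch: x,
                       state_placeholder: current_state})
        # np.asarray collapses the nested (layer, (c, h)) structure back into
        # the [num_layers, 2, batch_size, state_size] array the placeholder expects.
        current_state = np.asarray(state_out)

Since the state array is zeroed only once, before the first batch, the LSTM effectively stays stateful across session runs.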
