TensorFlow: Remember LSTM state for next batch (stateful LSTM)


Problem description


Given a trained LSTM model, I want to perform inference for single timesteps, i.e. seq_length = 1 in the example below. After each timestep the internal LSTM (memory and hidden) states need to be remembered for the next 'batch'. At the very beginning of inference the internal LSTM states init_c, init_h are computed from the input. These are then stored in an LSTMStateTuple object which is passed to the LSTM. During training this state is updated every timestep. However, for inference I want the state to be kept between batches, i.e. the initial states only need to be computed at the very beginning, and after that the LSTM states should be saved after each 'batch' (n=1).


I found this related StackOverflow question: Tensorflow, best way to save state in RNNs?. However, that only works with state_is_tuple=False, a behavior that is soon to be deprecated by TensorFlow (see rnn_cell.py). Keras seems to have a nice wrapper that makes stateful LSTMs possible, but I don't know the best way to achieve this in TensorFlow. This issue on the TensorFlow GitHub is also related to my question: https://github.com/tensorflow/tensorflow/issues/2838


Does anyone have good suggestions for building a stateful LSTM model?

inputs  = tf.placeholder(tf.float32, shape=[None, seq_length, 84, 84], name="inputs")
targets = tf.placeholder(tf.float32, shape=[None, seq_length], name="targets")

num_lstm_layers = 2

with tf.variable_scope("LSTM") as scope:

    lstm_cell  = tf.nn.rnn_cell.LSTMCell(512, initializer=initializer, state_is_tuple=True)
    self.lstm  = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_lstm_layers, state_is_tuple=True)

    init_c = # compute initial LSTM memory state using contents in placeholder 'inputs'
    init_h = # compute initial LSTM hidden state using contents in placeholder 'inputs'
    self.state = [tf.nn.rnn_cell.LSTMStateTuple(init_c, init_h)] * num_lstm_layers

    outputs = []

    for step in range(seq_length):

        if step != 0:
            scope.reuse_variables()

        # CNN features, as input for LSTM
        x_t = # ... 

        # LSTM step through time
        output, self.state = self.lstm(x_t, self.state)
        outputs.append(output)

Answer


I found out it was easiest to save the whole state for all layers in a placeholder.

# One (c, h) pair per layer, hence the size-2 second dimension.
init_state = np.zeros((num_layers, 2, batch_size, state_size))

...

state_placeholder = tf.placeholder(tf.float32, [num_layers, 2, batch_size, state_size])


Then unpack it and create a tuple of LSTMStateTuples before using the native TensorFlow RNN API.

# Split the placeholder per layer and rebuild the LSTMStateTuple structure.
# (tf.unpack was renamed tf.unstack in TensorFlow 1.0.)
l = tf.unpack(state_placeholder, axis=0)
rnn_tuple_state = tuple(
    [tf.nn.rnn_cell.LSTMStateTuple(l[idx][0], l[idx][1])
     for idx in range(num_layers)]
)


Then pass it into the RNN via the API:

cell = tf.nn.rnn_cell.LSTMCell(state_size, state_is_tuple=True)
cell = tf.nn.rnn_cell.MultiRNNCell([cell] * num_layers, state_is_tuple=True)
# x_input_batch: input placeholder of shape [batch_size, max_time, input_size]
outputs, state = tf.nn.dynamic_rnn(cell, x_input_batch, initial_state=rnn_tuple_state)


The state variable returned by dynamic_rnn can then be fed back into the placeholder for the next batch.
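
Below is a minimal sketch of that feed loop, assuming the graph above, a tf.Session named sess, and a hypothetical batch iterator (names such as batches and x_batch are illustrative and not part of the original answer):

import numpy as np

# Start from an all-zero state; shape matches state_placeholder.
current_state = np.zeros((num_layers, 2, batch_size, state_size))

for x_batch in batches:  # hypothetical iterator over input batches
    output_vals, current_state = sess.run(
        [outputs, state],
        feed_dict={x_input_batch: x_batch,
                   state_placeholder: current_state})
    # sess.run returns `state` as a tuple of LSTMStateTuples of numpy arrays.
    # Because this nested structure is uniform, it can be fed straight back
    # into state_placeholder (it stacks into the same
    # (num_layers, 2, batch_size, state_size) layout), so the next run
    # continues from where this one left off.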
