Cannot stack LSTM with MultiRNNCell and dynamic_rnn

Problem description

I am trying to build a multivariate time-series prediction model. I followed this tutorial on temperature forecasting: http://nbviewer.jupyter.org/github/addfor/tutorials/blob/master/machine_learning/ml16v04_forecasting_with_LSTM.ipynb

I want to extend this model to a multilayer LSTM model using the following code:

import tensorflow as tf

# hidden, num_layers, and features are defined earlier in the model
cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers, state_is_tuple=True)  # the same cell object repeated
output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)

But I get an error saying:

ValueError: Dimensions must be equal, but are 256 and 142 for 'rnn/while/rnn/multi_rnn_cell/cell_0/cell_0/lstm_cell/MatMul_1' (op: 'MatMul') with input shapes: [?,256], [142,512].

When I try this instead:

cell = []
for i in range(num_layers):
    # create a new LSTMCell object for each layer
    cell.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
cell = tf.contrib.rnn.MultiRNNCell(cell, state_is_tuple=True)
output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)

I do not get such an error, but the prediction is really bad.

I define hidden=128.

features = tf.reshape(features, [-1, n_steps, n_input]) has shape (?, 1, 14) in the single-layer case.

My data look like this: x.shape=(594,14), y.shape=(591,1).

I am confused about how to stack LSTM cells in TensorFlow. My tensorflow version is 0.14.

Answer

This is a very interesting question. Initially, I thought the two snippets would produce the same result (i.e., stacking two LSTM cells).

Code 1

cell = tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
cell = tf.contrib.rnn.MultiRNNCell([cell] * num_layers, state_is_tuple=True)
print(cell)

Code 2

cell = []
for i in range(num_layers):
    cell.append(tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True))
cell = tf.contrib.rnn.MultiRNNCell(cell, state_is_tuple=True)
print(cell)

However, if you print the cell in both cases, you get something like the following:

Code 1

[<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>, <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>]

Code 2

[<tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D7084E0>, <tensorflow.python.ops.rnn_cell_impl.BasicLSTMCell object at 0x000000000D708B00>]

If you look closely at the results:

  • For code 1, it prints a list of two LSTM cell objects where one object is a copy of the other (the pointers of the two objects are the same).
  • For code 2, it prints a list of two different LSTM cell objects (the pointers of the two objects are different); the identity check sketched below makes this concrete.
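
To see the difference directly, here is a minimal sketch (not from the original answer) that checks object identity; the constructor arguments mirror the question's own:

import tensorflow as tf

# [cell] * 2 repeats ONE reference; the list comprehension builds two distinct cells
cell = tf.contrib.rnn.LSTMCell(128, state_is_tuple=True)
shared = [cell] * 2
separate = [tf.contrib.rnn.LSTMCell(128, state_is_tuple=True) for _ in range(2)]
print(shared[0] is shared[1])      # True  -> both layers would reuse the same weights
print(separate[0] is separate[1])  # False -> each layer gets its own weights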

Therefore, if you think about the big picture (the actual TensorFlow operations may differ), stacking two LSTM cells does the following:

  1. First, map the inputs to the hidden units of LSTM cell 1 (in your case, 14 to 128).
  2. Second, map the hidden units of LSTM cell 1 to the hidden units of LSTM cell 2 (in your case, 128 to 128); the shape sketch after this list spells out what this means for the weight matrices.
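
This also explains the exact numbers in the error message. A sketch of the shape arithmetic, assuming the TF 1.x LSTMCell kernel layout [input_size + hidden, 4 * hidden] and the question's values hidden=128, n_input=14:

n_input, hidden = 14, 128

# Layer 1 builds its kernel the first time it runs:
kernel_shape = (n_input + hidden, 4 * hidden)   # (142, 512)

# Layer 2's matmul input is its own input (128) concatenated with its state (128):
layer2_cols = hidden + hidden                   # 256

# Reusing the SAME cell object makes layer 2 multiply a [?, 256] tensor by the
# [142, 512] kernel -> "Dimensions must be equal, but are 256 and 142".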

Therefore, when you try to make the same copy of the LSTM cell perform both of these operations, you get the error above, because the two operations require weight matrices of different dimensions.

However, if you set the number of hidden units equal to the number of input units (in your case, input is 14, so hidden is 14), there is no error even though you are using the same LSTM cell, because the weight-matrix dimensions are then the same.
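
A quick check of the same arithmetic for this case (a sketch under the same kernel-layout assumption as above):

n_input = hidden = 14

kernel_rows = n_input + hidden     # 28: rows of the shared kernel [28, 56]
layer2_cols = hidden + hidden      # 28: layer-2 input + its own state

assert kernel_rows == layer2_cols  # shapes agree, so no ValueError -- but the
                                   # two layers still share one set of weights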

Therefore, if you want to stack two LSTM cells, I think your second approach is correct.
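
For reference, a minimal self-contained sketch of that working pattern in the question's TF 1.x / tf.contrib style (hidden, num_layers, n_steps, and n_input take the question's values; the placeholder is illustrative):

import tensorflow as tf

n_steps, n_input, hidden, num_layers = 1, 14, 128, 2

features = tf.placeholder(tf.float32, [None, n_steps, n_input])

# One fresh LSTMCell per layer, so each layer owns its own weight matrices
cells = [tf.contrib.rnn.LSTMCell(hidden, state_is_tuple=True)
         for _ in range(num_layers)]
cell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)

output, _ = tf.nn.dynamic_rnn(cell=cell, inputs=features, dtype=tf.float32)
# output has shape [batch, n_steps, hidden]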
