Keras: reshape to connect LSTM and Conv


Problem description

This question exists as a GitHub issue, too. I would like to build a neural network in Keras which contains both 2D convolutions and an LSTM layer.

The network should classify MNIST. The training data in MNIST are 60000 grey-scale images of handwritten digits from 0 to 9. Each image is 28x28 pixels.

I've split the images into four parts (left/right, up/down) and rearranged them in four orders to get sequences for the LSTM.

|     |      |1 | 2|
|image|  ->  -------   -> 4 sequences: |1|2|3|4|,  |4|3|2|1|, |1|3|2|4|, |4|2|3|1|
|     |      |3 | 4|

Each of the small sub-images has the dimensions 14 x 14. The four sequences are stacked together along the width (it shouldn't matter whether it's width or height).

This creates a vector with the shape [60000, 4, 1, 56, 14] where:

  • 60000 is the number of samples
  • 4 is the number of elements in a sequence (the number of timesteps)
  • 1 is the depth of the colors (grayscale)
  • 56 and 14 are width and height
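For concreteness, the splitting and stacking described above might be sketched like this (a numpy-only sketch for one image; `make_sequences` is a hypothetical helper, and the exact stacking order is my reading of the diagram):

```python
import numpy as np

def make_sequences(img):
    """Split a 28x28 image into 14x14 quadrants, build the four
    orderings |1|2|3|4|, |4|3|2|1|, |1|3|2|4|, |4|2|3|1|, and stack
    them along the width. Returns shape (4, 1, 56, 14):
    (timesteps, channels, rows, cols)."""
    q1, q2 = img[:14, :14], img[:14, 14:]
    q3, q4 = img[14:, :14], img[14:, 14:]
    orders = [(q1, q2, q3, q4),   # |1|2|3|4|
              (q4, q3, q2, q1),   # |4|3|2|1|
              (q1, q3, q2, q4),   # |1|3|2|4|
              (q4, q2, q3, q1)]   # |4|2|3|1|
    steps = []
    for t in range(4):
        # at timestep t, stack the t-th quadrant of every ordering -> (56, 14)
        steps.append(np.concatenate([o[t] for o in orders], axis=0))
    return np.stack(steps)[:, np.newaxis, :, :]  # add the channel axis

x = make_sequences(np.arange(28 * 28, dtype="float32").reshape(28, 28))
```

Applied to all of MNIST, this yields the [60000, 4, 1, 56, 14] array described above.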

Now this should be given to a Keras model. The problem is to change the input dimensions between the CNN and the LSTM. I searched online and found this question: Python keras how to change the size of input after convolution layer into lstm layer

The solution seems to be a Reshape layer which flattens the image but retains the timesteps (as opposed to a Flatten layer which would collapse everything but the batch_size).

Here's my code so far:

# imports added for completeness; the code uses the Keras 1.x API
from keras.models import Sequential
from keras.layers import (Convolution2D, MaxPooling2D, Activation,
                          Reshape, Dropout, LSTM, Dense)

nb_filters = 32
kernel_size = (3, 3)
pool_size = (2, 2)
nb_classes = 10
batch_size = 64

model = Sequential()

model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
                        border_mode="valid", input_shape=[1, 56, 14]))
model.add(Activation("relu"))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=pool_size))

model.add(Reshape((56*14,)))
model.add(Dropout(0.25))
model.add(LSTM(5))
model.add(Dense(50))
model.add(Dense(nb_classes))
model.add(Activation("softmax"))

This code creates an error message:

ValueError: total size of new array must be unchanged

Apparently the input to the Reshape layer is incorrect. As an alternative, I tried to pass the timesteps to the Reshape layer, too:

model.add(Reshape((4,56*14)))

This doesn't feel right, and in any case the error stays the same.

Am I doing this the right way? Is a Reshape layer the proper tool to connect a CNN and an LSTM?

There are rather complex approaches to this problem, such as this: https://github.com/fchollet/keras/pull/1456, a TimeDistributed layer which seems to hide the timestep dimension from the following layers.

Or this: https://github.com/anayebi/keras-extra, a set of special layers for combining CNNs and LSTMs.

Why are there such complicated solutions (at least they seem complicated to me) if a simple Reshape does the trick?

Update:

Embarrassingly, I forgot that the dimensions will be changed by the pooling and (for lack of padding) the convolutions, too. kgrm advised me to use model.summary() to check the dimensions.

The output of the layer before the Reshape layer is (None, 32, 26, 5), so I changed the reshape to: model.add(Reshape((32*26*5,))).
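That (None, 32, 26, 5) shape can be checked by hand; a quick sketch of the "valid" convolution and pooling size rules (pure Python, no Keras needed; the two helper functions are mine, not a Keras API):

```python
def valid_conv(size, kernel):
    # "valid" convolution: no padding, stride 1
    return size - kernel + 1

def pool(size, window):
    # non-overlapping pooling
    return size // window

rows, cols = 56, 14
rows, cols = valid_conv(rows, 3), valid_conv(cols, 3)  # first 3x3 conv  -> 54 x 12
rows, cols = valid_conv(rows, 3), valid_conv(cols, 3)  # second 3x3 conv -> 52 x 10
rows, cols = pool(rows, 2), pool(cols, 2)              # 2x2 max pooling -> 26 x 5
# with 32 filters, the feature maps have shape (32, rows, cols) = (32, 26, 5)
```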

Now the ValueError is gone, instead the LSTM complains:

Exception: Input 0 is incompatible with layer lstm_5: expected ndim=3, found ndim=2

It seems like I need to pass the timestep dimension through the entire network. How can I do that? If I add it to the input_shape of the convolution, it complains too: Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode="valid", input_shape=[4, 1, 56, 14])

Exception: Input 0 is incompatible with layer convolution2d_44: expected ndim=4, found ndim=5

Answer

According to the Convolution2D definition, your input must be 4-dimensional, with dimensions (samples, channels, rows, cols). This is the direct reason why you are getting an error.

To resolve that you must use the TimeDistributed wrapper, which allows you to apply static (non-recurrent) layers across the timesteps.
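A minimal sketch of the wrapped architecture, written with current Keras layer names (Conv2D instead of the Keras 1.x Convolution2D, channels-last input instead of the question's channels-first shape; the layer sizes follow the question):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Input, TimeDistributed, Conv2D,
                                     MaxPooling2D, Flatten, Dropout,
                                     LSTM, Dense)

model = Sequential()
# channels-last input per sample: (timesteps, rows, cols, channels)
model.add(Input(shape=(4, 56, 14, 1)))
# TimeDistributed applies the wrapped layer to every timestep independently,
# so the timestep axis is preserved all the way to the LSTM
model.add(TimeDistributed(Conv2D(32, (3, 3), activation="relu")))
model.add(TimeDistributed(Conv2D(32, (3, 3), activation="relu")))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))
# flatten each timestep to a feature vector -> (None, 4, 26*5*32)
model.add(TimeDistributed(Flatten()))
model.add(Dropout(0.25))
model.add(LSTM(5))                 # consumes (batch, timesteps, features)
model.add(Dense(50, activation="relu"))
model.add(Dense(10, activation="softmax"))
```

This way the convolutions see ordinary 4-dimensional image batches at each timestep, while the LSTM receives the 3-dimensional (batch, timesteps, features) input it expects.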
