Keras: reshape to connect lstm and conv


Problem Description

This question exists as a github issue, too. I would like to build a neural network in Keras which contains both 2D convolutions and an LSTM layer.

The network should classify MNIST. The training data in MNIST are 60000 grey-scale images of handwritten digits from 0 to 9. Each image is 28x28 pixels.

I've split the images into four parts (left/right, up/down) and rearranged them in four orders to get sequences for the LSTM.

|     |      |1 | 2|
|image|  ->  -------   -> 4 sequences: |1|2|3|4|,  |4|3|2|1|, |1|3|2|4|, |4|2|3|1|
|     |      |3 | 4|

Each of the small sub-images has dimensions 14 x 14. The four sequences are stacked together along the width (it shouldn't matter whether it is the width or the height).

This creates an array with the shape [60000, 4, 1, 56, 14] where:

  • 60000 is the number of samples
  • 4 is the number of elements in a sequence (the number of timesteps)
  • 1 is the colour depth (grey-scale)
  • 56 and 14 are the width and height
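
Below is a minimal numpy sketch of one way to build such an array from MNIST, assuming the quadrant orderings shown above; the axis conventions and variable names are illustrative and may differ from the original preprocessing code.

import numpy as np
from keras.datasets import mnist

(X_train, y_train), _ = mnist.load_data()   # X_train: (60000, 28, 28)

# Split each 28x28 image into four 14x14 quadrants.
q1 = X_train[:, :14, :14]   # top-left     (1)
q2 = X_train[:, :14, 14:]   # top-right    (2)
q3 = X_train[:, 14:, :14]   # bottom-left  (3)
q4 = X_train[:, 14:, 14:]   # bottom-right (4)

# The four orderings |1|2|3|4|, |4|3|2|1|, |1|3|2|4|, |4|2|3|1|,
# each stacked into a single 56 x 14 strip.
orders = [(q1, q2, q3, q4), (q4, q3, q2, q1), (q1, q3, q2, q4), (q4, q2, q3, q1)]
strips = [np.concatenate(o, axis=1) for o in orders]   # each: (60000, 56, 14)

# Stack the four strips as timesteps and add a channel axis.
X = np.stack(strips, axis=1)[:, :, np.newaxis, :, :]   # (60000, 4, 1, 56, 14)
print(X.shape)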

Now this should be fed into a Keras model. The problem is changing the input dimensions between the CNN and the LSTM. I searched online and found this question: Python keras how to change the size of input after convolution layer into lstm layer

The solution seems to be a Reshape layer which flattens the image but retains the timesteps (as opposed to a Flatten layer which would collapse everything but the batch_size).
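
As a plain-numpy analogy of that intent (not Keras code; a toy batch of 2 samples with the per-sample shape described above): a Flatten-style reshape collapses everything except the batch axis, while the desired Reshape keeps the timestep axis and flattens only the per-step features.

import numpy as np

x = np.zeros((2, 4, 1, 56, 14))                   # (batch, timesteps, channels, height, width)

flat = x.reshape(x.shape[0], -1)                  # Flatten-like: (2, 3136), the timestep axis is lost
per_step = x.reshape(x.shape[0], x.shape[1], -1)  # keeps timesteps: (2, 4, 784), what an LSTM expects

print(flat.shape, per_step.shape)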

Here is my code so far:

from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Activation, Reshape, Dropout, Dense, LSTM

nb_filters = 32
kernel_size = (3, 3)
pool_size = (2, 2)
nb_classes = 10
batch_size = 64

model = Sequential()

# Convolutional front end (Keras 1.x API) operating on a single (1, 56, 14) image
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1],
    border_mode="valid", input_shape=[1, 56, 14]))
model.add(Activation("relu"))
model.add(Convolution2D(nb_filters, kernel_size[0], kernel_size[1]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=pool_size))

# Attempt to flatten the feature maps into a sequence for the LSTM
model.add(Reshape((56*14,)))
model.add(Dropout(0.25))
model.add(LSTM(5))
model.add(Dense(50))
model.add(Dense(nb_classes))
model.add(Activation("softmax"))

This code creates an error message:

ValueError: total size of new array must be unchanged

Apparently the input to the Reshape layer is incorrect. As an alternative, I tried to pass the timesteps to the Reshape layer, too:

model.add(Reshape((4,56*14)))

This doesn't feel right, and in any case the error stays the same.

Am I doing this the right way? Is a Reshape layer the proper tool to connect a CNN and an LSTM?

There are rather complex approaches to this problem, such as this one: https://github.com/fchollet/keras/pull/1456 , a TimeDistributed layer which seems to hide the timestep dimension from the following layers.

Or this: https://github.com/anayebi/keras-extra , a set of special layers for combining CNNs and LSTMs.

Why are there such complicated solutions (at least they seem complicated to me) if a simple Reshape does the trick?

Update:

Embarrassingly, I forgot that the dimensions will be changed by the pooling and (for lack of padding) the convolutions, too. kgrm advised me to use model.summary() to check the dimensions.

The output of the layer before the Reshape layer is (None, 32, 26, 5), so I changed the reshape to: model.add(Reshape((32*26*5,))).
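
For reference, a quick sanity check of where the 26 and 5 come from, assuming "valid" padding and stride 1 throughout:

h, w = 56, 14
h, w = h - 3 + 1, w - 3 + 1   # first 3x3 convolution   -> 54 x 12
h, w = h - 3 + 1, w - 3 + 1   # second 3x3 convolution  -> 52 x 10
h, w = h // 2, w // 2         # 2x2 max pooling         -> 26 x 5
print(h, w)                   # 26 5, matching the (None, 32, 26, 5) reported by model.summary()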

Now the ValueError is gone; instead the LSTM complains:

Exception: Input 0 is incompatible with layer lstm_5: expected ndim=3, found ndim=2

It seems like I need to pass the timestep dimension through the entire network. How can I do that? If I add it to the input_shape of the convolution, it complains, too: Convolution2D(nb_filters, kernel_size[0], kernel_size[1], border_mode="valid", input_shape=[4, 1, 56, 14])

Exception: Input 0 is incompatible with layer convolution2d_44: expected ndim=4, found ndim=5

Recommended Answer

According to the Convolution2D definition, your input must be 4-dimensional with dimensions (samples, channels, rows, cols). This is the direct reason why you are getting an error.

To resolve that you must use the TimeDistributed wrapper. This allows you to apply static (non-recurrent) layers across the timestep dimension.
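
A minimal sketch of what that could look like, keeping the Keras 1.x API used in the question; the layer sizes simply mirror the code above rather than a tuned architecture, and import paths and behaviour may vary between Keras versions:

from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Flatten, Dropout, Dense, LSTM
from keras.layers.wrappers import TimeDistributed

model = Sequential()

# TimeDistributed applies the wrapped layer to each of the 4 timesteps independently,
# so the conv stack sees one (1, 56, 14) sequence element at a time.
model.add(TimeDistributed(Convolution2D(32, 3, 3, border_mode="valid", activation="relu"),
                          input_shape=(4, 1, 56, 14)))
model.add(TimeDistributed(Convolution2D(32, 3, 3, activation="relu")))
model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2))))

# Flatten only the per-timestep feature maps: (batch, 4, 32, 26, 5) -> (batch, 4, 32*26*5).
model.add(TimeDistributed(Flatten()))
model.add(Dropout(0.25))
model.add(LSTM(5))
model.add(Dense(50))
model.add(Dense(10, activation="softmax"))

Wrapping the convolutional front end this way keeps the timestep axis intact all the way to the LSTM, which is what the bare Reshape above could not do once the convolutions and pooling changed the feature-map size.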
