Issues training a CNN with prime-number input dimensions


Problem description

I am currently developing a CNN model (an autoencoder) with Keras. This time my inputs are of shape (47,47,3), that is, a 47x47 image with 3 (RGB) channels.

I have worked with some CNNs in the past, but this time my input dimensions are prime numbers (47 pixels). I think this is causing issues with my implementation, specifically when using MaxPooling2D and UpSampling2D in my model. I noticed that the spatial dimensions no longer match the input after max pooling and then upsampling.

Using model.summary() I can see that after passing my (47,47,3) input through a Conv2D(24) and a MaxPooling2D with a (2,2) pool size (that is, 24 filters and half the spatial size) I get an output shape of (24, 24, 24).

Now, if I try to reverse that with an UpSampling2D of size (2,2) (doubling the spatial size) and convolve again, I get a (48,48,3) shaped output. That is one more row and one more column than needed.
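The size arithmetic behind this round trip can be checked without building a model. The sketch below is only an illustration: the helper names pooled_size and upsampled_size are made up here, and it assumes the pooling stride equals the pool size. The reported 24 is what padding='same' on the pooling layer would produce; with the default padding='valid' the result would be 23, so the 'same' setting is an assumption about the model in question.

```python
import math


def pooled_size(size, pool=2, padding="same"):
    """Spatial size after pooling when the stride equals the pool size."""
    if padding == "same":
        # 'same' pads the input so the last partial window is kept.
        return math.ceil(size / pool)
    # 'valid' drops the last partial window.
    return size // pool


def upsampled_size(size, factor=2):
    """Spatial size after repeating every row and column `factor` times."""
    return size * factor


size = 47
down = pooled_size(size, padding="same")
up = upsampled_size(down)
print(size, "->", down, "->", up)
```

Either way the round trip cannot return to 47: 24 doubles to 48 and 23 doubles to 46.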

To this I thought "no problem, just choose an upsampling size that gives you the desired 47 pixels", but given that 47 is a prime number it seems to me that no size can do that.
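A quick way to see why no integer upsampling factor works is to list every pair of positive integers (size, factor) whose product is 47. This is just a small illustration of the divisibility argument, not part of any model code.

```python
target = 47

# Every (size, factor) pair of positive integers with size * factor == target.
pairs = [(s, target // s) for s in range(1, target + 1) if target % s == 0]
print(pairs)
```

The only options are a 1x1 feature map upsampled by 47, or a 47x47 feature map left unchanged, and neither is useful in an autoencoder that actually downsamples.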

Is there any way to work around this problem that does not involve changing my input dimensions to a non-prime? Maybe I am missing something in my approach, or maybe Keras has some feature I am unaware of that could help here.

Recommended answer

I advise you to use ZeroPadding2D and Cropping2D. You can pad your image asymmetrically with zeros and obtain an even-sized image without resizing it. This should solve the problem with upsampling. Also, remember to set padding='same' in all of your convolutional layers.

Just to give you an example strategy for performing such operations:

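  1. If a feature map has an odd size before a pooling layer, zero-pad it to make it even.
  2. After the corresponding upsampling operation, use cropping to bring the feature map back to its original odd size.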
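The two steps above could look roughly like the following in Keras. This is a minimal sketch, not the asker's actual model: the number of layers, the 3x3 kernels, the activations and the choice to pad on the bottom and right are all assumptions made for illustration. Only ZeroPadding2D, Cropping2D and padding='same' come from the answer itself.

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(47, 47, 3))

# Step 1: pad one row at the bottom and one column on the right, 47 -> 48.
x = layers.ZeroPadding2D(padding=((0, 1), (0, 1)))(inputs)

# Encoder: padding='same' keeps the convolution from shrinking the map.
x = layers.Conv2D(24, (3, 3), activation="relu", padding="same")(x)
x = layers.MaxPooling2D((2, 2))(x)  # 48 -> 24

# Decoder: upsample back to the padded size.
x = layers.Conv2D(24, (3, 3), activation="relu", padding="same")(x)
x = layers.UpSampling2D((2, 2))(x)  # 24 -> 48
x = layers.Conv2D(3, (3, 3), activation="sigmoid", padding="same")(x)

# Step 2: crop the same row and column that were added, 48 -> 47.
outputs = layers.Cropping2D(cropping=((0, 1), (0, 1)))(x)

model = keras.Model(inputs, outputs)
model.summary()
```

Cropping the same side that was padded keeps the reconstructed pixels aligned with the input pixels. With more pooling stages, the same pad-then-crop pairing would be repeated wherever a feature map has an odd size.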
