Using dilated convolution in Keras


Problem description


In WaveNet, dilated convolutions are used to increase the receptive field of the layers above.

From the illustration, you can see that layers of dilated convolution with kernel size 2 and dilation rates that are powers of 2 create a tree-like structure of receptive fields. I tried to (very simply) replicate the above in Keras.

import tensorflow.keras as keras
nn = input_layer = keras.layers.Input(shape=(200, 2))
nn = keras.layers.Conv1D(5, 5, padding='causal', dilation_rate=2)(nn)
nn = keras.layers.Conv1D(5, 5, padding='causal', dilation_rate=4)(nn)
nn = keras.layers.Dense(1)(nn)
model = keras.Model(input_layer, nn)
opt = keras.optimizers.Adam(learning_rate=0.001)  # 'lr' is a deprecated alias for 'learning_rate'
model.compile(loss='mse', optimizer=opt)
model.summary()

And the output:

_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_4 (InputLayer)         [(None, 200, 2)]          0
_________________________________________________________________
conv1d_5 (Conv1D)            (None, 200, 5)            55
_________________________________________________________________
conv1d_6 (Conv1D)            (None, 200, 5)            130
_________________________________________________________________
dense_2 (Dense)              (None, 200, 1)            6
=================================================================
Total params: 191
Trainable params: 191
Non-trainable params: 0
_________________________________________________________________

I was expecting axis=1 to shrink after each conv1d layer, similar to the gif. Why is this not the case?

Solution

The model summary is as expected. As you note, using dilated convolutions results in an increase in the receptive field. However, a dilated convolution preserves the output shape of the input image/activation, because we are only changing the convolutional kernel. (With padding='causal', Keras additionally left-pads the input so that the output length always matches the input length.) A regular kernel could be the following:

0 1 0
1 1 1
0 1 0

A kernel with a dilation rate of 2 inserts zeros between the entries of the original kernel, as below:

0 0 1 0 0
0 0 0 0 0
1 0 1 0 1
0 0 0 0 0
0 0 1 0 0
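
To make this concrete, here is a minimal NumPy sketch (the helper dilate_kernel is illustrative, not a Keras API) that builds the 5x5 dilated kernel above by spreading the base kernel's weights over a strided grid of zeros:

import numpy as np

def dilate_kernel(kernel, rate):
    # Effective size grows to rate * (k - 1) + 1; the original weights
    # land every `rate` steps, with zeros everywhere in between.
    k = kernel.shape[0]
    dilated = np.zeros((rate * (k - 1) + 1,) * 2, dtype=kernel.dtype)
    dilated[::rate, ::rate] = kernel
    return dilated

base = np.array([[0, 1, 0],
                 [1, 1, 1],
                 [0, 1, 0]])
print(dilate_kernel(base, 2))  # prints the 5x5 kernel shown above

Note that the number of trainable weights is unchanged; dilation only spreads the same nine weights over a larger window, which is why it enlarges the receptive field without adding parameters.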

In fact, you can see that our original kernel is itself a dilated kernel with a dilation rate of 1. Alternative ways of increasing the receptive field instead downsize the input image; max pooling and strided convolution are two such methods.

For example, if you want to increase the receptive field by decreasing the size of your output shape, you could use strided convolution as below. I replace the dilated convolutions with strided convolutions; you will see that the output shape shrinks at every layer.

import tensorflow.keras as keras
nn = input_layer = keras.layers.Input(shape=(200, 2))
nn = keras.layers.Conv1D(5, 5, padding='causal', strides=2)(nn)
nn = keras.layers.Conv1D(5, 5, padding='causal', strides=4)(nn)
nn = keras.layers.Dense(1)(nn)
model = keras.Model(input_layer, nn)
opt = keras.optimizers.Adam(learning_rate=0.001)  # 'lr' is a deprecated alias for 'learning_rate'
model.compile(loss='mse', optimizer=opt)
model.summary()

Model: "model_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
input_2 (InputLayer)         [(None, 200, 2)]          0
_________________________________________________________________
conv1d_3 (Conv1D)            (None, 100, 5)            55
_________________________________________________________________
conv1d_4 (Conv1D)            (None, 25, 5)             130
_________________________________________________________________
dense_1 (Dense)              (None, 25, 1)             6
=================================================================
Total params: 191
Trainable params: 191
Non-trainable params: 0
_________________________________________________________________
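
Max pooling, the other alternative mentioned above, shrinks the temporal axis in the same way. Here is a minimal sketch (the pool sizes mirror the strides above; this variant is an illustration, not part of the original answer):

import tensorflow.keras as keras

nn = input_layer = keras.layers.Input(shape=(200, 2))
nn = keras.layers.Conv1D(5, 5, padding='causal')(nn)
nn = keras.layers.MaxPooling1D(pool_size=2)(nn)  # (None, 100, 5)
nn = keras.layers.Conv1D(5, 5, padding='causal')(nn)
nn = keras.layers.MaxPooling1D(pool_size=4)(nn)  # (None, 25, 5)
nn = keras.layers.Dense(1)(nn)
model = keras.Model(input_layer, nn)
model.summary()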

To summarize, dilated convolution is just another way to increase the receptive field of your model. It has the benefit of preserving the output shape of your input image.
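
As a quick sanity check on the receptive-field claim: for a stack of stride-1 convolutions, the receptive field is 1 plus the sum of (kernel_size - 1) * dilation_rate over the layers. For the dilated model above:

# (kernel_size, dilation_rate) of each Conv1D in the dilated model
conv_layers = [(5, 2), (5, 4)]
receptive_field = 1 + sum((k - 1) * d for k, d in conv_layers)
print(receptive_field)  # 25 input timesteps feed each output timestep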
