Applying a mask to a Conv2D kernel in Keras

Question

I'm looking to apply a mask to the kernel of a Conv2D layer in Keras. I'm having a bit of difficulty understanding kernel shape.

For kernel_size = 3, and filters = 1, the shape of the kernel is (3, 3, 4, 1) => (kernel_size, kernel_size, ???, filters)

What does the 3rd dimension in the kernel represent?

How can I take an NxN mask and multiply each of the kernel filters by it?

This is the code I have so far. I am not sure if it will work as I expect it to because I don't fully understand the kernel shape.

import functools

import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Conv2D


class MaskedConv2D(tf.keras.layers.Layer):
    def __init__(self, *args, **kwargs):
        super(MaskedConv2D, self).__init__()
        self.conv2d = Conv2D(*args, **kwargs)

    def build(self, input_shape):
        # inputs are (x, mask), so input_shape[0] is the shape of x
        self.conv2d.build(input_shape[0])
        # keep a handle on the original (unmasked) convolution op
        self._convolution_op = self.conv2d._convolution_op

    def masked_convolution_op(self, filters, kernel, mask):
        m = K.expand_dims(K.expand_dims(mask[0, ...], axis=2), axis=3) # (3, 3) => (3, 3, 1, 1)
        m = K.tile(m, (1, 1, kernel.shape[2], kernel.shape[3])) # (3, 3, 1, 1) => (3, 3, 4, 1)
        return self._convolution_op(filters, tf.math.multiply(kernel, m))

    def call(self, inputs):
        x, mask = inputs
        # swap in a convolution op that multiplies the kernel by the mask first
        self.conv2d._convolution_op = functools.partial(self.masked_convolution_op, mask=mask)
        return self.conv2d.call(x)

Recommended answer

First: the kernel shape

The kernel of a 2D convolution has the following shape

[ height, width, input_filters, output_filters ]

The third dimension equals the number of input filters (channels). This is critical. A kernel of shape (3, 3, 4, 1), as in the question, means the layer's input has 4 filters.

Let's consider how convolution is done manually. Here are the steps:

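  • Take a patch from a batch of images (BatchSize, height, width, filters)
  • Reshape it to [BatchSize, height * width * filters]
  • Matrix-multiply it by the reshaped kernel [height * width * filters, output_filters]
  • At this point our data has shape [BatchSize, output_filters]
  • Add the bias of shape [output_filters], broadcasting it across the whole batch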

The output is the filters for each patch.
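To make the manual procedure concrete, here is a minimal NumPy sketch of it. It assumes stride 1 and no padding, and the function name conv2d_via_patches is made up for illustration; it is not part of Keras or TensorFlow.

```python
import numpy as np


def conv2d_via_patches(images, kernel, bias):
    """Stride-1, no-padding 2D convolution as patch extraction + matmul.

    images: (batch, height, width, in_filters)
    kernel: (k_h, k_w, in_filters, out_filters)
    bias:   (out_filters,)
    """
    batch, height, width, _ = images.shape
    k_h, k_w, _, out_filters = kernel.shape
    out_h, out_w = height - k_h + 1, width - k_w + 1

    # Flatten the kernel to (k_h * k_w * in_filters, out_filters)
    flat_kernel = kernel.reshape(-1, out_filters)

    output = np.empty((batch, out_h, out_w, out_filters), dtype=images.dtype)
    for i in range(out_h):
        for j in range(out_w):
            # One patch per image in the batch: (batch, k_h, k_w, in_filters)
            patch = images[:, i:i + k_h, j:j + k_w, :]
            # Reshape to (batch, k_h * k_w * in_filters)
            flat_patch = patch.reshape(batch, -1)
            # Matrix multiply, then add the bias (broadcast across the batch)
            output[:, i, j, :] = flat_patch @ flat_kernel + bias
    return output


images = np.ones((1, 4, 4, 3), dtype=np.float32)
kernel = np.ones((3, 3, 3, 2), dtype=np.float32)
bias = np.zeros(2, dtype=np.float32)

out = conv2d_via_patches(images, kernel, bias)
print(out.shape)  # (1, 2, 2, 2)
```

With all-ones inputs and weights, each output value is simply the number of kernel entries, 3 * 3 * 3 = 27.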

Given that the weights in the convolution are shaped like [ height, width, input_filters, output_filters ] and we want to apply a mask of shape [ height, width ], we can simply broadcast the mask like so

masked_weight = weight * mask.reshape([height,width,1,1])
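As a quick sanity check of that broadcast, the following NumPy sketch uses the (3, 3, 4, 1) kernel shape from the question with random weights. The mask values and the random seed are arbitrary choices for illustration.

```python
import numpy as np

height, width, input_filters, output_filters = 3, 3, 4, 1

rng = np.random.default_rng(0)
weight = rng.normal(size=(height, width, input_filters, output_filters))
mask = np.array([[1, 1, 0],
                 [1, 1, 1],
                 [0, 1, 1]], dtype=weight.dtype)

# (3, 3) -> (3, 3, 1, 1); broadcasting stretches the two size-1 axes
masked_weight = weight * mask.reshape([height, width, 1, 1])

print(masked_weight.shape)  # (3, 3, 4, 1)

# Every (input_filter, output_filter) slice is masked in the same way
for c in range(input_filters):
    for f in range(output_filters):
        assert np.array_equal(masked_weight[:, :, c, f],
                              weight[:, :, c, f] * mask)
```

No explicit tiling to the full kernel shape is needed, since broadcasting already applies the same 2D mask to every input/output filter pair.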

Our TensorFlow Keras layer could be written like so

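import functools

import tensorflow as tf


class MaskedConv2D(tf.keras.layers.Layer):
    def __init__(self, *args, **kwargs):
        super(MaskedConv2D, self).__init__()
        self.conv2d = tf.keras.layers.Conv2D(*args, **kwargs)

    def build(self, input_shape):
        # inputs are (x, mask), so input_shape[0] is the shape of x
        self.conv2d.build(input_shape[0])
        # NOTE: _convolution_op is a private Keras attribute and may not
        # exist in every TensorFlow version
        self._convolution_op = self.conv2d._convolution_op

    def masked_convolution_op(self, filters, kernel, mask):
        # (height, width) -> (height, width, 1, 1) so the mask broadcasts
        # over the input_filters and output_filters axes of the kernel
        mask = tf.reshape(mask, mask.shape + [1, 1])
        return self._convolution_op(filters, tf.math.multiply(kernel, mask))

    def call(self, inputs):
        x, mask = inputs
        # swap in a convolution op that multiplies the kernel by the mask first
        self.conv2d._convolution_op = functools.partial(self.masked_convolution_op, mask=mask)
        return self.conv2d.call(x)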

We can test it with the following script

import numpy as np

mcon = MaskedConv2D(filters=2, kernel_size=[3, 3])

x = np.ones([1, 4, 4, 3], dtype=np.float32)
mask = tf.constant([[1, 1, 0], [1, 1, 1], [0, 1, 1]], dtype=tf.float32)

# hack: initialize it by running some data through it
mcon((x, mask))

# set all the weights to 1 for testing
mcon.set_weights([np.ones([3, 3, 3, 2]), np.zeros([2])])

# pass in a matrix of 1s and mask out 2 elements for each input filter
mcon((x, mask))

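This gives the predictable output: 7 of the 9 mask entries are 1 and there are 3 input filters, so every output value is 7 × 3 = 21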

<tf.Tensor: shape=(1, 2, 2, 2), dtype=float32, numpy=
    array([[[[21., 21.],
             [21., 21.]],
            [[21., 21.],
             [21., 21.]]]], dtype=float32)>
