How to build a multi-class convolutional neural network with Keras

Question

I am trying to implement a U-Net in Keras with a TensorFlow backend for an image segmentation task. I feed images of size (128, 96) into the network together with mask images of size (12288, 6), since the masks are flattened. I have 6 different classes (0-5), which accounts for the second dimension of the mask shape. The masks have been encoded to one-hot labels using the to_categorical() function. At the moment I use just one input image, and I also use the same image as validation and test data.
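
For reference, here is a minimal sketch of the mask preparation described above (the variable names mask and mask_onehot are mine, not from the original post):

import numpy as np
from keras.utils import to_categorical

# A label mask of shape (128, 96) holding integer class IDs 0-5.
mask = np.random.randint(0, 6, size=(128, 96))

# Flatten to (12288,) and one-hot encode to (12288, 6), matching the
# target shape described above (128 * 96 = 12288).
mask_onehot = to_categorical(mask.reshape(-1), num_classes=6)
print(mask_onehot.shape)  # (12288, 6)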

I would like the U-Net to perform image segmentation, where class 0 corresponds to the background. When I train my U-Net for only a few epochs (1-10), the resulting predicted mask seems to assign random classes to each pixel. When I train the network longer (50+ epochs), all pixels are classified as background. Since I train and test on the same image, I find this very strange, as I was expecting the network to overfit. How can I fix this problem? Could there be something wrong with the way I feed the mask images and the real images to the network?

I have tried manually giving weights to the network to put less emphasis on the background than on the other classes, and I have tried different combinations of losses, different ways of shaping the mask image, and many other things, but nothing gave good results.
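
For context, one common way to de-emphasize the background class in Keras is a per-pixel sample_weight together with sample_weight_mode='temporal'; the sketch below illustrates the general technique, assuming model is the U-Net from get_unet() below and x_train/y_train are the image and flattened one-hot mask arrays from the question. It is not necessarily the exact weighting scheme the poster tried.

import numpy as np

# Assumed: model is the U-Net returned by get_unet(), x_train is the
# (n, 128, 96, 1) image array and y_train the (n, 12288, 6) one-hot masks.

# Down-weight background (class 0) relative to the other five classes.
class_weights = np.array([0.1, 1.0, 1.0, 1.0, 1.0, 1.0])

# Per-pixel weight array of shape (n, 12288), looked up by each pixel's class.
sample_weight = class_weights[y_train.argmax(axis=-1)]

# 'temporal' mode tells Keras to accept one weight per flattened pixel.
model.compile(optimizer='adam', loss='categorical_crossentropy',
              sample_weight_mode='temporal')
model.fit(x_train, y_train, sample_weight=sample_weight, epochs=10)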

Below is the code of my network. It is based on the U-Net taken from this repository. I managed to train it for the two-class case with good results, but I don't know how to extend it to more classes now.

# Imports needed to run this snippet:
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, concatenate, Reshape
from keras.optimizers import Adam

def get_unet(self):

    inputs = Input((128, 96,1))
    #Input shape=(?,128,96,1)

    conv1 = Conv2D(64, (3,3), activation = 'relu', padding = 'same',
      kernel_initializer = 'he_normal', input_shape=(None,128,96,6))(inputs)
    #Conv1 shape=(?,128,96,64)
    conv1 = Conv2D(64, (3,3), activation = 'relu', padding = 'same',
          kernel_initializer = 'he_normal')(conv1)
    #Conv1 shape=(?,128,96,64)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    #pool1 shape=(?,64,48,64)


    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(pool1)
    #Conv2 shape=(?,64,48,128)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(conv2)
    #Conv2 shape=(?,64,48,128)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    #Pool2 shape=(?,32,24,128)

    conv5 = Conv2D(256, (3,3), activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(pool2)
    conv5 = Conv2D(256, (3,3), activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(conv5)

    up8 = Conv2D(128, 2, activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv5))
    merge8 = concatenate([conv2,up8], axis = 3)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(merge8)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same',
         kernel_initializer = 'he_normal')(conv8)


    up9 = Conv2D(64, (2,2), activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv8))
    merge9 = concatenate([conv1,up9], axis = 3)
    conv9 = Conv2D(64, (3,3), activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(merge9)
    conv9 = Conv2D(64, (3,3), activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(conv9)
    conv9 = Conv2D(6, (3,3), activation = 'relu', padding = 'same',
        kernel_initializer = 'he_normal')(conv9)

    conv10 = Conv2D(6, (1,1), activation = 'sigmoid')(conv9)
    conv10 = Reshape((128*96,6))(conv10)

    model = Model(input = inputs, output = conv10)
    model.compile(optimizer = Adam(lr = 1e-5), loss = 'binary_crossentropy',
          metrics = ['accuracy'])

    return model

Can anyone point out what is wrong with my model?

Answer

In my experience, also with a U-Net for segmentation, the network tends to do this:

  • Go to total black or total white
  • Take a long time with the loss seemingly frozen before it finally finds its way.

I also use the "train on just one image" method to find that convergence, and then adding the other images is OK.
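
As a rough illustration of that sanity check (variable names are hypothetical), you can deliberately overfit a single image/mask pair and watch whether the loss actually approaches zero:

# x and y are the single image/mask pair from the question, with shapes
# (1, 128, 96, 1) and (1, 12288, 6). A network that cannot overfit one
# image usually has a data or loss problem, not a capacity problem.
history = model.fit(x, y, epochs=200, verbose=0)
print(history.history['loss'][-1])  # should approach 0 if learning works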

But I had to try a lot of times, and the only time it worked pretty fast was when I used:

  • a final activation of 'sigmoid'
  • loss = 'binary_crossentropy'

But I wasn't using "relu" anywhere... perhaps that influences the convergence speed a little? Thinking about "relu", which outputs only zero or positive values, there is a big region of this function that has no gradient. Maybe having lots of "relu" activations creates a lot of "flat" areas without gradients? (I'd have to think about it more to confirm.)
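
If you want to test that hypothesis, Keras ships activations that keep a gradient for negative inputs; below is a sketch of swapping the question's first convolution to LeakyReLU (a suggestion for experimentation, not something the answer itself did):

from keras.layers import Conv2D, LeakyReLU

# Same first convolution as in the question, but with a LeakyReLU that keeps
# a small gradient (alpha * x) for negative inputs instead of a flat zero.
conv1 = Conv2D(64, (3, 3), padding='same',
               kernel_initializer='he_normal')(inputs)
conv1 = LeakyReLU(alpha=0.1)(conv1)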

Try a few times (and have the patience to wait through many, many epochs) with different weight initializations.

There is also a chance that your learning rate is too big.
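
If the learning rate turns out to be the issue, one standard remedy is to let Keras shrink it automatically when the loss plateaus; a minimal sketch using the built-in ReduceLROnPlateau callback:

from keras.callbacks import ReduceLROnPlateau

# Halve the learning rate whenever the loss has not improved for 5 epochs.
reduce_lr = ReduceLROnPlateau(monitor='loss', factor=0.5, patience=5,
                              min_lr=1e-7, verbose=1)
model.fit(x, y, epochs=100, callbacks=[reduce_lr])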

About to_categorical(): have you tried plotting/printing your masks? Do they really look like what you expect them to?
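
For example, one way to eyeball the masks is to undo the flattening and one-hot encoding and render the class indices; the plotting code below is my sketch (y is assumed to be the (1, 12288, 6) mask array from the question):

import numpy as np
import matplotlib.pyplot as plt

# Recover a (128, 96) image of class indices from the (12288, 6) one-hot mask.
mask_img = y[0].argmax(axis=-1).reshape(128, 96)
print(np.unique(mask_img))   # which classes are actually present?
plt.imshow(mask_img, vmin=0, vmax=5)
plt.colorbar()
plt.show()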
