CNN model conditional layer in Keras


Problem Description


      I am trying to build a conditional CNN model.

      At the first stage of the model, I feed my data to Model 1; then, based on the prediction of Model 1, I want the data to be trained through either the Conditional Cat model or the Conditional Dog model, and finally to get the output from whichever conditional model was selected. How can I do this?

      Note: here is my attempt:

      import keras
      from keras.layers import *
      from keras.models import *
      from keras.utils import *
      
      img_rows,img_cols,number_of_class = 256,256,2
      input = Input(shape=(img_rows,img_cols,3))
      
      #----------- main model (Model 1) ------------------------------------
      conv_01 = Convolution2D(64, 3, 3, activation='relu',name = 'conv_01') (input)
      conv_02 = Convolution2D(64, 3, 3, activation='relu',name = 'conv_02') (conv_01)
      
      skip_dog =  conv_02
      
      conv_03 = Convolution2D(64, 3, 3, activation='relu',name = 'conv_03') (conv_02)
      
      skip_cat =  conv_03
      
      conv_04 = Convolution2D(64, 3, 3, activation='relu',name = 'conv_04') (conv_03)
      
      
      flatten_main_model =  Flatten() (conv_04)
      Output_main_model = Dense(units = number_of_class , activation = 'softmax', name = "Output_layer")(flatten_main_model)
      
      
      #----------- Conditional  Cat model ------------------------------------ 
      conv_05 = Convolution2D(64, 3, 3, activation='relu',name = 'conv_05') (skip_cat)
      flatten_cat_model =  Flatten() (conv_05)
      Output_cat_model = Dense(units = number_of_class , activation = 'softmax', name = "Output_layer_cat")(flatten_cat_model)
      
      #----------- Conditional  Dog model ------------------------------------ 
      conv_06 = Convolution2D(64, 3, 3, activation='relu',name = 'conv_06') (skip_dog)
      flatten_dog_model =  Flatten() (conv_06)
      Output_dog_model = Dense(units = number_of_class , activation = 'softmax', name = "Output_layer_dog")(flatten_dog_model)
      
      #----------------------------- My discrete 3 models --------------------------------
      model_01 = Model(inputs = input , outputs = Output_main_model,name = 'model_main')
      model_02_1 = Model(inputs = input , outputs = Output_cat_model ,name = 'Conditional_cat_model')
      model_02_2 = Model(inputs = input , outputs = Output_dog_model ,name = 'Conditional_dog_model')
      

      How can I merge these 3 models (model_01, model_02_1, model_02_2) based on these conditions?

      **Conditions are:**

      1. Feed the data to model_01
      2. Based on the model_01 result, feed the data to model_02_1 or model_02_2
      3. Finally, predict the output from model_02_1 or model_02_2

Solution

      The problem with conditionals in neural networks

      The issue with a switch or conditionals (like if-then-else) as part of a neural network is that conditionals are not differentiable everywhere. Therefore automatic differentiation methods cannot be applied directly, and solving this in general is quite complex. Check this for more details.
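
      As a minimal sketch of why this is a problem (my own illustration, assuming TensorFlow 2 with eager execution), a hard threshold propagates no gradient while a smooth surrogate does:

      import tensorflow as tf
      
      x = tf.Variable(0.3)
      
      with tf.GradientTape() as tape:
          hard = tf.cast(x > 0.5, tf.float32)   # a step-function "if"
      print(tape.gradient(hard, x))             # None - the comparison/cast has no gradient
      
      with tf.GradientTape() as tape:
          soft = tf.sigmoid(10.0 * (x - 0.5))   # a smooth surrogate for the same decision
      print(tape.gradient(soft, x))             # a nonzero gradient the optimizer can use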

      A shortcut is to train 3 separate models independently and then, at inference time, use an explicit conditional control flow to decide which of them to infer from.

      # Training (pseudocode) - three models trained independently
      model1 = model.fit(all images, P(cat/dog))
      model2 = model.fit(all images, P(cat))
      model3 = model.fit(all images, P(dog))
      
      # Inference (pseudocode) - explicit conditional control flow
      if model1.predict == Cat:
          final prediction = model2.predict
      else:
          final prediction = model3.predict
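
      A runnable version of that inference routing might look like this (a sketch of mine; the helper name and the assumption that model1 outputs [P(cat), P(dog)] are not from the original):

      import numpy as np
      
      def conditional_predict(image, model1, model2, model3):
          # model1 is assumed to output [P(cat), P(dog)] for a single image
          gate = model1.predict(image[None, ...])[0]
          # hard, non-differentiable routing: fine at inference time, not trainable
          specialist = model2 if np.argmax(gate) == 0 else model3
          return specialist.predict(image[None, ...])[0]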
      

      But I don't think you are looking for that. I think you are looking to include conditionals as part of the computation graph itself.

      Sadly, as far as I know, there is no direct way to build an if-then condition as part of a computation graph. The keras.backend.switch that you may have seen allows you to work with tensor outputs, but not with the layers of a graph during training. That's why you will see it used inside loss functions and not in computation graphs (it throws input errors there).
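
      For instance, a minimal sketch of keras.backend.switch inside a loss function (the loss itself is an illustration of mine, not something from this answer):

      from tensorflow.keras import backend as K
      
      def switched_loss(y_true, y_pred):
          # K.switch picks between two tensor expressions based on a scalar condition
          mse = K.mean(K.square(y_true - y_pred), axis=-1)
          mae = K.mean(K.abs(y_true - y_pred), axis=-1)
          return K.switch(K.mean(y_true) > 0.5, mse, mae)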

      A possible solution: skip connections & soft-switching

      You can, however, try to build something similar with skip connections and soft switching.

      A skip connection is a connection from a previous layer to a later layer that allows information to be passed to subsequent layers. This is quite common in very deep networks, where information from the original data would otherwise be lost. Check U-Net or ResNet, for example, both of which use skip connections between layers to pass information to future layers.
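
      A minimal skip-connection sketch (my own illustration; the layer sizes are arbitrary):

      from tensorflow.keras import layers
      
      inp = layers.Input((64, 64, 3))
      x = layers.Conv2D(16, 3, padding='same', activation='relu')(inp)
      skip = x                              # keep a handle to the early feature map
      x = layers.Conv2D(16, 3, padding='same', activation='relu')(x)
      x = layers.Add()([x, skip])           # skip connection: early information flows forward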

      The next issue is that of switching. You want to switch between 2 possible paths in the graph. What you can do is a soft-switching method, which I took as inspiration from this paper. Notice that in order to switch between 2 distributions over words (one from the decoder and another from the input), the authors multiply them by p and (1-p) to get a combined distribution. This soft switch allows the model to pick the next predicted word from either the decoder or from the input itself. (This helps when you want your chatbot to reuse words the user typed as part of its response!)
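
      In layer form, a soft switch is just a convex combination of two paths. A minimal sketch (the soft_switch helper is my own, assuming (batch, features) path tensors and a (batch, 1) gate in [0, 1]):

      from tensorflow.keras import layers
      
      def soft_switch(path_a, path_b, p):
          # output = p * path_a + (1 - p) * path_b, so gradients flow through both paths
          a = layers.Multiply()([path_a, p])
          one_minus_p = layers.Lambda(lambda t: 1.0 - t)(p)
          b = layers.Multiply()([path_b, one_minus_p])
          return layers.Add()([a, b])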

      With an understanding of these 2 concepts, let's try to intuitively build our architecture.

      1. First, we need a single-input multi-output graph, since we are training 2 models.

      2. Our first model is a multi-class classifier that predicts individual probabilities for Cat and Dog separately. It will be trained with a softmax activation and a categorical_crossentropy loss.

      3. Next, let's take the output node that predicts the probability of Cat and multiply convolution layer 3's output with it. This can be done with a Lambda layer.

      4. Similarly, let's take the probability of Dog and multiply it with convolution layer 2's output. This can be seen as follows:

        • If my first model perfectly predicts a cat and not a dog, then the computation will be 1*(Conv3) and 0*(Conv2).
        • If the first model perfectly predicts a dog and not a cat, then the computation will be 0*(Conv3) and 1*(Conv2).
        • You can think of this as either a soft switch OR a forget gate from an LSTM. The forget gate is a sigmoid (0 to 1) output that multiplies the cell state to gate it, allowing the LSTM to forget or remember previous time-steps. A similar concept applies here!
      5. These gated Conv3 and Conv2 outputs can now be further processed, flattened, concatenated, and passed to another Dense layer for the final prediction.

      This way, if the model is not sure whether it is a dog or a cat, features from both conv2 and conv3 participate in the second model's predictions. This is how you can use skip connections and a soft-switch-inspired mechanism to add some amount of conditional control flow to your network.

      Check my implementation of the computation graph below.

      from tensorflow.keras import layers, Model, utils
      import numpy as np
      
      # Dummy data: 10 RGB images with 2-class targets
      X = np.random.random((10,500,500,3))
      y = np.random.random((10,2))
      
      # Model
      inp = layers.Input((500,500,3))
      
      x = layers.Conv2D(6, 3, name='conv1')(inp)
      x = layers.MaxPooling2D(3)(x)
      
      c2 = layers.Conv2D(9, 3, name='conv2')(x)    # skip source for the "dog" path
      c2 = layers.MaxPooling2D(3)(c2)
      
      c3 = layers.Conv2D(12, 3, name='conv3')(c2)  # skip source for the "cat" path
      c3 = layers.MaxPooling2D(3)(c3)
      
      x = layers.Conv2D(15, 3, name='conv4')(c3)
      x = layers.MaxPooling2D(3)(x)
      
      x = layers.Flatten()(x)
      out1 = layers.Dense(2, activation='softmax', name='first')(x)  # Model 1: [P(cat), P(dog)]
      
      # Slice the two probabilities out of the softmax with Lambda layers
      c = layers.Lambda(lambda x: x[:,:1])(out1)   # P(cat)
      d = layers.Lambda(lambda x: x[:,1:])(out1)   # P(dog)
      
      # Soft switch: gate each skip connection by its predicted probability
      c = layers.Multiply()([c3, c])               # P(cat) * Conv3
      d = layers.Multiply()([c2, d])               # P(dog) * Conv2
      
      # Conditional "cat" branch
      c = layers.Conv2D(15, 3, name='conv5')(c)
      c = layers.MaxPooling2D(3)(c)
      c = layers.Flatten()(c)
      
      # Conditional "dog" branch
      d = layers.Conv2D(12, 3, name='conv6')(d)
      d = layers.MaxPooling2D(3)(d)
      d = layers.Conv2D(15, 3, name='conv7')(d)
      d = layers.MaxPooling2D(3)(d)
      d = layers.Flatten()(d)
      
      # Merge both gated branches for the final conditional prediction
      x = layers.concatenate([c,d])
      x = layers.Dense(32)(x)
      out2 = layers.Dense(2, activation='softmax',name='second')(x)
      
      # One input, two outputs: Model 1's prediction and the conditional head
      model = Model(inp, [out1, out2])
      model.compile(optimizer='adam', loss='categorical_crossentropy', loss_weights=[0.5, 0.5])
      
      model.fit(X, [y, y], epochs=5)
      
      utils.plot_model(model, show_layer_names=False, show_shapes=True)
      

      Epoch 1/5
      1/1 [==============================] - 1s 1s/step - loss: 0.6819 - first_loss: 0.7424 - second_loss: 0.6214
      Epoch 2/5
      1/1 [==============================] - 0s 423ms/step - loss: 0.6381 - first_loss: 0.6361 - second_loss: 0.6400
      Epoch 3/5
      1/1 [==============================] - 0s 442ms/step - loss: 0.6137 - first_loss: 0.6126 - second_loss: 0.6147
      Epoch 4/5
      1/1 [==============================] - 0s 434ms/step - loss: 0.6214 - first_loss: 0.6159 - second_loss: 0.6268
      Epoch 5/5
      1/1 [==============================] - 0s 427ms/step - loss: 0.6248 - first_loss: 0.6184 - second_loss: 0.6311
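
      Once trained, a single predict call returns both heads. A small usage sketch (my own addition, reusing the model and X defined above):

      # out1 is Model 1's cat/dog prediction; out2 is the conditional head's output
      p_first, p_second = model.predict(X[:1])
      print('Model 1 (gating) :', p_first)
      print('Conditional head :', p_second)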
      
