How to use hidden layer activations to construct loss function and provide y_true during fitting in Keras?


Problem description


Assume I have a model like this. M1 and M2 are two layers linking left and right sides of the model. The example model: Red lines indicate backprop directions


During training, I hope M1 can learn a mapping from L2_left activation to L2_right activation. Similarly, M2 can learn a mapping from L3_right activation to L3_left activation. The model also needs to learn the relationship between two inputs and the output. Therefore, I should have three loss functions for M1, M2, and L3_left respectively.

I could probably use:

model.compile(optimizer='rmsprop',
              loss={'M1': 'mean_squared_error',
                    'M2': 'mean_squared_error',
                    'L3_left': 'mean_squared_error'})


But during training, we need to provide y_true, for example:

model.fit([input_1,input_2], y_true)


In this case, the y_true is the hidden layer activations and not from a dataset. Is it possible to build this model and train it using its hidden layer activations?

Answer


If you have only one output, you must have only one loss function.


If you want three loss functions, you must have three outputs, and, of course, three Y vectors for training.


If you want loss functions in the middle of the model, you must take outputs from those layers.

Creating the model's graph: (if your model is already defined, see the end of this answer)

#Here, all "SomeLayer(blabla)" could be replaced by a "SomeModel" if necessary
    #Example of using a layer or a model:
        #M1 = SomeLayer(blablabla)(L12) 
        #M1 = SomeModel(L12)

from keras.models import Model
from keras.layers import *

inLef = Input((shape1))   
inRig = Input((shape2))

L1Lef = SomeLayer(blabla)(inLef)
L2Lef = SomeLayer(blabla)(L1Lef)
M1 = SomeLayer(blablaa)(L2Lef) #this is an output

L1Rig = SomeLayer(balbla)(inRig)

conc2Rig = Concatenate(axis=?)([L1Rig,M1]) #Or Add, or Multiply, however you're joining the models    
L2Rig = SomeLayer(nlanlab)(conc2Rig)
L3Rig = SomeLayer(najaljd)(L2Rig)

M2 = SomeLayer(babkaa)(L3Rig) #this is an output

conc3Lef = Concatenate(axis=?)([L2Lef,M2])
L3Lef = SomeLayer(blabla)(conc3Lef) #this is an output

Creating the model with three outputs:


Now you've got your graph ready and you know what the outputs are, you create the model:

model = Model([inLef,inRig], [M1,M2,L3Lef])
model.compile(loss='mse', optimizer='rmsprop')
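As a runnable sketch of the graph above (layer types and sizes are made up here: `Dense` stands in for every `SomeLayer`, and the concatenations use the default axis):

```python
from keras.models import Model
from keras.layers import Input, Dense, Concatenate

# Hypothetical input shapes, just to make the graph concrete
inLef = Input((8,))
inRig = Input((6,))

L1Lef = Dense(16, activation='relu')(inLef)
L2Lef = Dense(16, activation='relu')(L1Lef)
M1 = Dense(16, activation='relu')(L2Lef)          # output 1

L1Rig = Dense(16, activation='relu')(inRig)
conc2Rig = Concatenate()([L1Rig, M1])
L2Rig = Dense(16, activation='relu')(conc2Rig)
L3Rig = Dense(16, activation='relu')(L2Rig)
M2 = Dense(16, activation='relu')(L3Rig)          # output 2

conc3Lef = Concatenate()([L2Lef, M2])
L3Lef = Dense(1)(conc3Lef)                        # output 3

model = Model([inLef, inRig], [M1, M2, L3Lef])
model.compile(loss='mse', optimizer='rmsprop')
```

The model then reports two inputs and three outputs, one loss term per output.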


If you want different losses for each output, then you create a list:

#example of custom loss function, if necessary
def lossM1(yTrue,yPred):
    return keras.backend.sum(keras.backend.abs(yTrue-yPred))

#compiling with three different loss functions
model.compile(loss = [lossM1, 'mse','binary_crossentropy'], optimizer =??)


But you've got to have three different yTraining too, for training with:

model.fit([input_1,input_2], [yTrainM1,yTrainM2,y_true], ....)
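The per-output dict syntax from the question also works once the output layers are named, and `loss_weights` (a standard `compile` argument) can balance the three terms against each other. A minimal sketch with a made-up toy model:

```python
from keras.models import Model
from keras.layers import Input, Dense

# Tiny stand-in model with named output layers
inp = Input((4,))
h = Dense(8, activation='relu')(inp)
out1 = Dense(2, name='M1')(h)
out2 = Dense(2, name='M2')(h)
out3 = Dense(1, name='L3_left')(h)
model = Model(inp, [out1, out2, out3])

# Losses keyed by output layer name, as in the question,
# plus illustrative relative weights for each term
model.compile(optimizer='rmsprop',
              loss={'M1': 'mse', 'M2': 'mse', 'L3_left': 'mse'},
              loss_weights={'M1': 0.5, 'M2': 0.5, 'L3_left': 1.0})
```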

If your model is already defined and you didn't create its graph like I did:


Then, you have to find in yourModel.layers[i] which ones are M1 and M2, so you create a new model like this:

M1 = yourModel.layers[indexForM1].output
M2 = yourModel.layers[indexForM2].output
newModel = Model([inLef,inRig], [M1,M2,yourModel.output])
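If the layers were given names when the model was built, looking them up with `get_layer` is less fragile than hunting for indices in `yourModel.layers`. A sketch, using a made-up stand-in model with layers named `M1` and `M2`:

```python
from keras.models import Model
from keras.layers import Input, Dense

# Hypothetical pre-existing model with named intermediate layers
inp = Input((4,))
h1 = Dense(8, activation='relu', name='M1')(inp)
h2 = Dense(8, activation='relu', name='M2')(h1)
out = Dense(1)(h2)
yourModel = Model(inp, out)

# Grab the intermediate tensors by layer name
M1 = yourModel.get_layer('M1').output
M2 = yourModel.get_layer('M2').output
newModel = Model(yourModel.input, [M1, M2, yourModel.output])
```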

If you want the two outputs to be equal:


In this case, just subtract the two outputs in a lambda layer, and make that lambda layer be an output of your model, with expected values = 0.


Using the exact same vars as before, we'll just create two additional layers to subtract outputs:

diffM1L1Rig = Lambda(lambda x: x[0] - x[1])([L1Rig,M1])
diffM2L2Lef = Lambda(lambda x: x[0] - x[1])([L2Lef,M2])

Now your model should be:

newModel = Model([inLef,inRig],[diffM1L1Rig,diffM2L2Lef,L3Lef])


And training will expect those two differences to be zero:

yM1 = np.zeros((shapeOfM1Output))
yM2 = np.zeros((shapeOfM2Output))
newModel.fit([input_1,input_2], [yM1,yM2,y_true], ...)
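Putting the whole "equal outputs" variant together with dummy data (shapes and layer sizes are invented; the zero targets push each subtracted pair of activations toward each other):

```python
import numpy as np
from keras.models import Model
from keras.layers import Input, Dense, Concatenate, Lambda

inLef = Input((8,))
inRig = Input((6,))

L1Lef = Dense(16, activation='relu')(inLef)
L2Lef = Dense(16, activation='relu')(L1Lef)
M1 = Dense(16, activation='relu')(L2Lef)

L1Rig = Dense(16, activation='relu')(inRig)
L2Rig = Dense(16, activation='relu')(Concatenate()([L1Rig, M1]))
L3Rig = Dense(16, activation='relu')(L2Rig)
M2 = Dense(16, activation='relu')(L3Rig)

L3Lef = Dense(1)(Concatenate()([L2Lef, M2]))

# Difference layers: training drives these toward zero
diffM1L1Rig = Lambda(lambda x: x[0] - x[1])([L1Rig, M1])
diffM2L2Lef = Lambda(lambda x: x[0] - x[1])([L2Lef, M2])

newModel = Model([inLef, inRig], [diffM1L1Rig, diffM2L2Lef, L3Lef])
newModel.compile(loss='mse', optimizer='rmsprop')

# Random dummy data plus all-zero targets for the two differences
input_1 = np.random.rand(32, 8)
input_2 = np.random.rand(32, 6)
y_true = np.random.rand(32, 1)
yM1 = np.zeros((32, 16))
yM2 = np.zeros((32, 16))
newModel.fit([input_1, input_2], [yM1, yM2, y_true], epochs=1, verbose=0)
```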

