Keras, Tensorflow: Merge two different model outputs into one

Question

I am working on a deep learning model where I am trying to combine the outputs of two different models:

The overall structure is as follows:

So the first model takes one matrix, for example [10 x 30]:

#input 1
input_text          = layers.Input(shape=(1,), dtype="string")
embedding           = ElmoEmbeddingLayer()(input_text)
model_a             = Model(inputs = [input_text] , outputs=embedding)
                      # shape : [10,50]
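
For reference, ElmoEmbeddingLayer is not defined in the question. A minimal sketch of such a layer, assuming the TF1-style tensorflow_hub API (the hub URL, the 1024-dim 'default' signature, and the class body are my assumptions; the question's layer apparently projects further down to 50 dimensions), might look like:

import tensorflow as tf
import tensorflow_hub as hub
from keras.layers import Layer

class ElmoEmbeddingLayer(Layer):
    def __init__(self, **kwargs):
        self.dimensions = 1024   #ELMo's 'default' signature returns 1024-dim vectors
        super(ElmoEmbeddingLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        #load the pre-trained ELMo module from TF Hub
        self.elmo = hub.Module('https://tfhub.dev/google/elmo/2',
                               trainable=self.trainable, name='elmo_module')
        super(ElmoEmbeddingLayer, self).build(input_shape)

    def call(self, x):
        #the module expects a 1-D batch of strings
        return self.elmo(tf.squeeze(tf.cast(x, tf.string), axis=1),
                         as_dict=True, signature='default')['default']

    def compute_output_shape(self, input_shape):
        return (input_shape[0], self.dimensions)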

Now the second model takes two input matrices:

X_in               = layers.Input(tensor=K.variable(np.random.uniform(0,9,[10,32])))
M_in               = layers.Input(tensor=K.variable(np.random.uniform(1,-1,[10,10])))

md_1               = New_model()([X_in, M_in]) #new_model defined somewhere
model_s            = Model(inputs = [X_in, M_in], outputs = md_1)
                     # shape : [10,50]

I want to make these two matrices trainable. In TensorFlow I was able to do this with:

matrix_a = tf.get_variable(name='matrix_a',
                           shape=[10,10],
                           dtype=tf.float32,
                           initializer=tf.constant_initializer(np.array(matrix_a)),
                           trainable=True)

I have no clue how to make matrix_a and matrix_b trainable, or how to merge the outputs of both networks and then feed in the input.

I went through this question, but couldn't find an answer because their problem statement is different from mine.

What I have tried so far:

#input 1
input_text          = layers.Input(shape=(1,), dtype="string")
embedding           = ElmoEmbeddingLayer()(input_text)
model_a             = Model(inputs = [input_text] , outputs=embedding)
                      # shape : [10,50]

X_in               = layers.Input(tensor=K.variable(np.random.uniform(0,9,[10,10])))
M_in               = layers.Input(tensor=K.variable(np.random.uniform(1,-1,[10,100])))

md_1               = New_model()([X_in, M_in]) #new_model defined somewhere
model_s            = Model(inputs = [X_in, M_in], outputs = md_1)
                    # shape : [10,50]


#transpose second model output

transpose        = Lambda(lambda x: K.transpose(x))
agglayer         = transpose(md_1)

# dot product of first and second model outputs
dott             = Lambda(lambda x: K.dot(x[0], x[1]))
kmean_layer      = dott([embedding, agglayer])


# input 
final_model = Model(inputs=[input_text, X_in, M_in], outputs=kmean_layer,name='Final_output')
final_model.compile(loss = 'categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
final_model.summary() 

Model summary:

Update:

Model b

X = np.random.uniform(0,9,[10,32])
M = np.random.uniform(1,-1,[10,10])


X_in = layers.Input(tensor=K.variable(X))
M_in = layers.Input(tensor=K.variable(M))



layer_one      = Model_b()([M_in, X_in])
dropout2       = Dropout(dropout_rate)(layer_one)
layer_two      = Model_b()([layer_one, X_in])

model_b_ = Model([X_in, M_in], layer_two, name='model_b')

Model a

length = 150


dic_size = 100
embed_size = 12

input_text = Input(shape=(length,))
embedding = Embedding(dic_size, embed_size)(input_text)

embedding = LSTM(5)(embedding) 
embedding = Dense(10)(embedding)

model_a = Model(input_text, embedding, name = 'model_a')

I am merging them like this:

mult = Lambda(lambda x: tf.matmul(x[0], x[1], transpose_b=True))([embedding, model_b_.output])

final_model = Model(inputs=[model_b_.input[0], model_b_.input[1], model_a.input], outputs=mult)

Is this the right way to matmul two Keras models?

I don't know whether I am merging the outputs correctly and whether the model is right.

I would greatly appreciate it if anyone could give me some advice on how to make those matrices trainable and how to merge the models' outputs correctly, then feed the input.

Thanks in advance!

Answer

Trainable weights

Ok. Since you are going to have custom trainable weights, the way to do this in Keras is to create a custom layer.

Now, since your custom layer has no inputs, we will need a hack that will be explained later.

So, this is the layer definition for the custom weights:

from keras.layers import *
from keras.models import Model
from keras.initializers import get as get_init, serialize as serial_init
import keras.backend as K
import tensorflow as tf


class TrainableWeights(Layer):

    #you can pass keras initializers when creating this layer
    #kwargs will take base layer arguments, such as name and others if you want
    def __init__(self, shape, initializer='uniform', **kwargs):
        super(TrainableWeights, self).__init__(**kwargs)
        self.shape = shape
        self.initializer = get_init(initializer)


    #build is where you define the weights of the layer
    def build(self, input_shape):
        self.kernel = self.add_weight(name='kernel', 
                                      shape=self.shape, 
                                      initializer=self.initializer, 
                                      trainable=True)
        self.built = True


    #call is the layer operation - due to keras limitation, we need an input
    #warning, I'm supposing the input is a tensor with value 1 and no shape or shape (1,)
    def call(self, x):
        return x * self.kernel


    #for keras to build the summary properly
    def compute_output_shape(self, input_shape):
        return self.shape


    #only needed for saving/loading this layer in model.save()
    def get_config(self):
        config = {'shape': self.shape, 'initializer': serial_init(self.initializer)}
        base_config = super(TrainableWeights, self).get_config()
        return dict(list(base_config.items()) + list(config.items()))

Now, this layer should be used like this:

dummyInputs = Input(tensor=K.constant([1]))
trainableWeights = TrainableWeights(shape)(dummyInputs)
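
As a quick sanity check (a sketch, assuming the definitions above): the multiplication in call() broadcasts the 1-valued dummy input against the kernel, so the output tensor picks up the weight's shape:

#the 1-valued dummy input broadcasts against the kernel
w = TrainableWeights((10, 30))(Input(tensor=K.constant([1])))
print(K.int_shape(w))   #(10, 30)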

Model A

Having the layer defined, we can start modeling.
First, let's see the model_a side:

#general vars
length = 150
dic_size = 100
embed_size = 12

#for the model_a segment
input_text = Input(shape=(length,))
embedding = Embedding(dic_size, embed_size)(input_text)

#the following two lines are just a resource to reach the desired shape
embedding = LSTM(5)(embedding) 
embedding = Dense(50)(embedding)

#creating model_a here is optional, only if you want to use model_a independently later
model_a = Model(input_text, embedding, name = 'model_a')

Model B

For this, we are going to use our TrainableWeights layer.
But first, let's simulate a New_model() as mentioned.

#simulates New_model() #notice the explicit batch_shape for the matrices
newIn1 = Input(batch_shape = (10,10))
newIn2 = Input(batch_shape = (10,30))
newOut1 = Dense(50)(newIn1)
newOut2 = Dense(50)(newIn2)
newOut = Add()([newOut1, newOut2])
new_model = Model([newIn1, newIn2], newOut, name='new_model')   
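
Both Dense(50) branches map their (10, 10) and (10, 30) inputs to (10, 50), so the Add() output matches the [10, 50] shape the question expects:

#sanity check on the simulated branch
print(new_model.output_shape)   #(10, 50)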

Now, the whole branch:

#the matrices    
dummyInput = Input(tensor = K.constant([1]))
X_in = TrainableWeights((10,10), initializer='uniform')(dummyInput)
M_in = TrainableWeights((10,30), initializer='uniform')(dummyInput)

#the output of the branch   
md_1 = new_model([X_in, M_in])

#optional, only if you want to use model_s independently later
model_s = Model(dummyInput, md_1, name='model_s')

The whole model

Finally, we can join the branches in a whole model.
Notice how I didn't have to use model_a or model_s here. You can if you want, but those submodels are not needed, unless you later want to use them individually for other purposes. (Even if you created them, you don't need to change the code below to use them; they're already part of the same graph.)

#I prefer tf.matmul because it's clear and understandable while K.dot has weird behaviors
mult = Lambda(lambda x: tf.matmul(x[0], x[1], transpose_b=True))([embedding, md_1])

#final model
model = Model([input_text, dummyInput], mult, name='full_model')
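
To illustrate that last point (a sketch, assuming model_a and model_s were created above): since model_a.output is embedding and model_s.output is md_1, wiring through the submodels builds the identical graph:

#equivalent wiring through the optional submodels
mult_alt  = Lambda(lambda x: tf.matmul(x[0], x[1], transpose_b=True))(
                   [model_a.output, model_s.output])
model_alt = Model([model_a.input, model_s.input], mult_alt, name='full_model_alt')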

Now, train it:

model.compile('adam', 'binary_crossentropy', metrics=['accuracy'])
#only the text input needs feeding; the dummy input is a constant tensor
model.fit(np.random.randint(0,dic_size, size=(128,length)),
          np.ones((128, 10)))
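
After training, the learned matrices can be read back from the TrainableWeights layers (a sketch; layer names are auto-generated, so matching by type is safest):

#inspect the trained matrices
for layer in model.layers:
    if isinstance(layer, TrainableWeights):
        print(layer.name, layer.get_weights()[0].shape)   #(10, 10) and (10, 30)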

Since the output is now 2D, there is no problem with 'categorical_crossentropy'; my earlier comment was only because of doubts about the output shape.
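
If a per-row probability distribution were wanted instead (an assumption on my part, not something the original setup requires), one option is to append a softmax over the 10 output columns and switch the loss:

#hypothetical variant: softmax output paired with categorical crossentropy
probs = Activation('softmax')(mult)
clf   = Model([input_text, dummyInput], probs, name='full_model_softmax')
clf.compile('adam', 'categorical_crossentropy', metrics=['accuracy'])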
