Merge 3 Deep Networks and Train End-to-End


Problem Description

I'm using deep learning concepts but I'm a beginner at it. I'm trying to build a feature fusion setup using 3 deep neural network models: the idea is to get features from all three models, do classification on a final single sigmoid layer, and then get the results. Here is the code that I run.

Code:

from keras.layers import Input, Dense
from keras.models import Model
from sklearn.model_selection import train_test_split
import numpy
# random seed for reproducibility
numpy.random.seed(2)
# loading load pima indians diabetes dataset, past 5 years of medical history
dataset = numpy.loadtxt('https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv', delimiter=",")
# split into input (X) and output (Y) variables, splitting csv data
X = dataset[:, 0:8]
Y = dataset[:, 8]
x_train, x_validation, y_train, y_validation = train_test_split(X, Y, test_size=0.20, random_state=5)
#create the input layer
input_layer = Input(shape=(8,))
A2 = Dense(8, activation='relu')(input_layer)
A3 = Dense(30, activation='relu')(A2)
B2 = Dense(40, activation='relu')(A2)
B3 = Dense(30, activation='relu')(B2)
C2 = Dense(50, activation='relu')(B2)
C3 = Dense(5, activation='relu')(C2)
merged = Model(inputs=[input_layer],outputs=[A3,B3,C3])
final_model = Dense(1, activation='sigmoid')(merged)
final_model.compile(loss="binary_crossentropy",
              optimizer="adam", metrics=['accuracy'])
# call the function to fit to the data (training the network)
final_model.fit(x_train, y_train, epochs=2000, batch_size=50,
          validation_data=(x_validation, y_validation))
# evaluate the model
scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1], scores[1] * 100))

Here is the error I'm facing:

if x.shape.ndims is None:

AttributeError: 'Functional' object has no attribute 'shape'

Please help me fix this issue, or if anyone knows what code I should use, let me know; I'm willing to change the code but not the concept. Thank you.

Based on @M.Innat's answer, we've tried the following. The idea is that we first build 3 models and then build a final/combined model by joining these models with a single classifier. But I am facing a discrepancy: when I train each model individually, they give ~90% accuracy, but when I combine them, they hardly reach 60 or 70%.

Code for MODEL 1:

from keras.models import Sequential  # Sequential is used below

model = Sequential()
# input layer requires input_dim param
model.add(Dense(10, input_dim=8, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(5, activation='relu'))
# sigmoid instead of relu for final probability between 0 and 1
model.add(Dense(1, activation='sigmoid'))

# compile the model, adam gradient descent (optimized)
model.compile(loss="binary_crossentropy",
              optimizer="adam", metrics=['accuracy'])

# call the function to fit to the data (training the network)
model.fit(x_train, y_train, epochs=1000, batch_size=50,
          validation_data=(x_validation, y_validation))

# evaluate the model
# note: this evaluates on the full dataset X, Y (training + validation samples)
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1] * 100))
model.save('diabetes_risk_nn.h5')

MODEL 1 Accuracy = 94.14%, and similar for the other 2 models:

MODEL 2 Accuracy = 93.62%, MODEL 3 Accuracy = 92.71%

Next, we merged the models as @M.Innat suggested, using Models 1, 2, and 3 above. But the score does not come near ~90%. FINAL Combined Model:

# Define Model A 
input_layer = Input(shape=(8,))
A2 = Dense(10, activation='relu')(input_layer)
A3 = Dense(50, activation='relu')(A2)
A4 = Dense(50, activation='relu')(A3)
A5 = Dense(50, activation='relu')(A4)
A6 = Dense(50, activation='relu')(A5)
A7 = Dense(50, activation='relu')(A6)
A8 = Dense(5, activation='relu')(A7)
model_a = Model(inputs=input_layer, outputs=A8, name="ModelA")

# Define Model B 
input_layer = Input(shape=(8,))
B2 = Dense(10, activation='relu')(input_layer)
B3 = Dense(50, activation='relu')(B2)
B4 = Dense(40, activation='relu')(B3)
B5 = Dense(60, activation='relu')(B4)
B6 = Dense(30, activation='relu')(B5)
B7 = Dense(50, activation='relu')(B6)
B8 = Dense(50, activation='relu')(B7)
B9 = Dense(5, activation='relu')(B8)
model_b = Model(inputs=input_layer, outputs=B9, name="ModelB")

# Define Model C
input_layer = Input(shape=(8,))
C2 = Dense(10, activation='relu')(input_layer)
C3 = Dense(50, activation='relu')(C2)
C4 = Dense(40, activation='relu')(C3)
C5 = Dense(40, activation='relu')(C4)
C6 = Dense(70, activation='relu')(C5)
C7 = Dense(50, activation='relu')(C6)
C8 = Dense(50, activation='relu')(C7)
C9 = Dense(60, activation='relu')(C8)
C10 = Dense(50, activation='relu')(C9)
C11 = Dense(5, activation='relu')(C10)
model_c = Model(inputs=input_layer, outputs=C11, name="ModelC")
all_three_models = [model_a, model_b, model_c]
all_three_models_input = Input(shape=all_three_models[0].input_shape[1:])

And then combine these three:

import tensorflow as tf  # needed for tf.keras.layers.concatenate below

models_output = [model(all_three_models_input) for model in all_three_models]
Concat        = tf.keras.layers.concatenate(models_output, name="Concatenate")
final_out     = Dense(1, activation='sigmoid')(Concat)
final_model   = Model(inputs=all_three_models_input, outputs=final_out, name='Ensemble')
#tf.keras.utils.plot_model(final_model, expand_nested=True)
final_model.compile(loss="binary_crossentropy",
              optimizer="adam", metrics=['accuracy'])
# call the function to fit to the data (training the network)
final_model.fit(x_train, y_train, epochs=1000, batch_size=50,
          validation_data=(x_validation, y_validation))

# evaluate the model

scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1], scores[1] * 100))
final_model.save('diabetes_risk_nn.h5')

But unlike the individual models, which each gave ~90%, this combined final model gave an accuracy of around 70%.

Recommended Answer

According to your code, there is only one model (not three). And judging by the outputs you tried to produce, I think you're looking for something like this:

Dataset

import tensorflow as tf 
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from sklearn.model_selection import train_test_split
import numpy

# random seed for reproducibility
numpy.random.seed(2)
# loading load pima indians diabetes dataset, past 5 years of medical history
dataset = numpy.loadtxt('https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv', delimiter=",")

# split into input (X) and output (Y) variables, splitting csv data
X = dataset[:, 0:8]
Y = dataset[:, 8]

x_train, x_validation, y_train, y_validation = train_test_split(X, Y, test_size=0.20, random_state=5)

Model

#create the input layer
input_layer = Input(shape=(8,))

A2 = Dense(8, activation='relu')(input_layer)
A3 = Dense(30, activation='relu')(A2)

B2 = Dense(40, activation='relu')(input_layer)
B3 = Dense(30, activation='relu')(B2)

C2 = Dense(50, activation='relu')(input_layer)
C3 = Dense(5, activation='relu')(C2)


merged = tf.keras.layers.concatenate([A3,B3,C3])
final_out = Dense(1, activation='sigmoid')(merged)

final_model = Model(inputs=[input_layer], outputs=final_out)
tf.keras.utils.plot_model(final_model)

Training

final_model.compile(loss="binary_crossentropy",
              optimizer="adam", metrics=['accuracy'])

# call the function to fit to the data (training the network)
final_model.fit(x_train, y_train, epochs=5, batch_size=50,
          validation_data=(x_validation, y_validation))

# evaluate the model
scores = final_model.evaluate(x_validation,y_validation)
print("\n%s: %.2f%%" % (final_model.metrics_names[1], scores[1] * 100))

Epoch 1/5
13/13 [==============================] - 1s 15ms/step - loss: 0.7084 - accuracy: 0.6803 - val_loss: 0.6771 - val_accuracy: 0.6883
Epoch 2/5
13/13 [==============================] - 0s 5ms/step - loss: 0.6491 - accuracy: 0.6600 - val_loss: 0.5985 - val_accuracy: 0.6623
Epoch 3/5
13/13 [==============================] - 0s 5ms/step - loss: 0.6161 - accuracy: 0.6813 - val_loss: 0.6805 - val_accuracy: 0.6883
Epoch 4/5
13/13 [==============================] - 0s 5ms/step - loss: 0.6335 - accuracy: 0.7003 - val_loss: 0.6115 - val_accuracy: 0.6623
Epoch 5/5
13/13 [==============================] - 0s 5ms/step - loss: 0.5684 - accuracy: 0.7285 - val_loss: 0.6150 - val_accuracy: 0.6883
5/5 [==============================] - 0s 2ms/step - loss: 0.6150 - accuracy: 0.6883

accuracy: 68.83%


Update

Based on your comment:

Let me explain what I'm trying to do. First I create 3 DNN models separately, then I try to combine those models to get the features of all three; after that, I want to classify using all the extracted features and then evaluate the accuracy. That's what I'm actually trying to develop.

  • Create 3 models separately - OK, 3 models
  • Combine them to get features - OK, feature extractors
  • Do classification - OK, average the models' output feature maps and then pass the result to a classifier - in other words, ensembling.

    Let's do this. First, build three models separately.

    # Define Model A 
    input_layer = Input(shape=(8,))
    A2 = Dense(8, activation='relu')(input_layer)
    A3 = Dense(30, activation='relu')(A2)
    C3 = Dense(5, activation='relu')(A3)
    model_a = Model(inputs=input_layer, outputs=C3, name="ModelA")
    
    # Define Model B 
    input_layer = Input(shape=(8,))
    A2 = Dense(8, activation='relu')(input_layer)
    A3 = Dense(30, activation='relu')(A2)
    C3 = Dense(5, activation='relu')(A3)
    model_b = Model(inputs=input_layer, outputs=C3, name="ModelB")
    
    # Define Model C
    input_layer = Input(shape=(8,))
    A2 = Dense(8, activation='relu')(input_layer)
    A3 = Dense(30, activation='relu')(A2)
    C3 = Dense(5, activation='relu')(A3)
    model_c = Model(inputs=input_layer, outputs=C3, name="ModelC")
    

    I used the same number of parameters; change them yourself. Anyway, these three models act as feature extractors (not classifiers). Next, we combine their outputs by averaging them and then pass the result to the classifier.

    all_three_models = [model_a, model_b, model_c]
    all_three_models_input = Input(shape=all_three_models[0].input_shape[1:])
    
    
    models_output = [model(all_three_models_input) for model in all_three_models]
    Avg           = tf.keras.layers.average(models_output, name="Average")
    final_out     = Dense(1, activation='sigmoid')(Avg)
    final_model   = Model(inputs=all_three_models_input, outputs=final_out, name='Ensemble')
    

    tf.keras.utils.plot_model(final_model, expand_nested=True)
    

    Now you can train the model and evaluate it on the test set; a minimal compile/fit sketch is shown below. Hope this helps.
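
    A minimal training and evaluation sketch for this ensemble, assuming the final_model just defined and the x_train / x_validation split from the question; it simply reuses the Adam setup from the earlier examples (the SGD + ModelCheckpoint variant further below is the tuned version):

    # compile the ensemble with the same loss/optimizer as the earlier examples
    final_model.compile(loss="binary_crossentropy",
                        optimizer="adam", metrics=['accuracy'])

    # train the ensemble end-to-end on the training split
    final_model.fit(x_train, y_train, epochs=1000, batch_size=50,
                    validation_data=(x_validation, y_validation))

    # evaluate on the held-out split
    scores = final_model.evaluate(x_validation, y_validation)
    print("\n%s: %.2f%%" % (final_model.metrics_names[1], scores[1] * 100))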

    (1). You can add a seed.

    from tensorflow.keras.models import Model, Sequential
    from tensorflow.keras.layers import Dense, Dropout
    from sklearn.model_selection import train_test_split
    import tensorflow as tf
    import os, numpy

    # random seed for reproducibility
    numpy.random.seed(101)
    tf.random.set_seed(101)
    os.environ['TF_CUDNN_DETERMINISTIC'] = '1'

    dataset = .. your data

    # split into input (X) and output (Y) variables, splitting csv data
    X = dataset[:, 0:8]
    Y = dataset[:, 8]
    x_train, x_validation, y_train, y_validation = train_test_split(X, Y,
                                                    test_size=0.20, random_state=101)

    (2). Try the SGD optimizer. Also, use the ModelCheckpoint callback to save the weights with the highest validation accuracy.

    final_model.compile(loss="binary_crossentropy",
                  optimizer="sgd", metrics=['accuracy'])
    
    model_save = tf.keras.callbacks.ModelCheckpoint(
                    'merge_best.h5',
                    monitor="val_accuracy",
                    verbose=0,
                    save_best_only=True,
                    save_weights_only=True,
                    mode="max",
                    save_freq="epoch"
                )
    
    # call the function to fit to the data (training the network)
    final_model.fit(x_train, y_train, epochs=1000, batch_size=256, callbacks=[model_save],
              validation_data=(x_validation, y_validation))
    

    Evaluate on the test set.

    # evaluate the model
    final_model.load_weights('merge_best.h5')
    scores = final_model.evaluate(x_validation,y_validation)
    print("\n%s: %.2f%%" % (final_model.metrics_names[1], scores[1] * 100))
    

    5/5 [==============================] - 0s 4ms/step - loss: 0.6543 - accuracy: 0.7662
    
    accuracy: 76.62%
    
