How to perform cross-validation with the Keras functional API in Python


Question

I want to perform cross-validation on a Keras model with multiple inputs, so I tried KerasClassifier. This works fine with a normal Sequential model that has only one input. However, when I use the functional API and extend the model to two inputs, sklearn's cross_val_predict does not seem to work as expected.

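# Imports assumed here (tf.keras); UniversalEmbedding is the asker's own
# embedding function and is not shown in the question.
import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda, Dense, Concatenate
from tensorflow.keras.models import Model

def create_model():
    # First input: raw text, embedded and passed through its own Dense layer
    input_text = Input(shape=(1,), dtype=tf.string)
    embedding = Lambda(UniversalEmbedding, output_shape=(512, ))(input_text)
    dense = Dense(256, activation='relu')(embedding)
    # Second input: title, processed the same way
    input_title = Input(shape=(1,), dtype=tf.string)
    embedding_title = Lambda(UniversalEmbedding, output_shape=(512, ))(input_title)
    dense_title = Dense(256, activation='relu')(embedding_title)
    out = Concatenate()([dense, dense_title])

    pred = Dense(2, activation='softmax')(out)
    model = Model(inputs=[input_text, input_title], outputs=pred)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    return model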

The cross-validation code that fails:

keras_classifier = KerasClassifier(build_fn=create_model, epochs=10, batch_size=10, verbose=1)
# random_state only takes effect (and is only accepted by recent sklearn) together with shuffle=True
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = cross_val_predict(keras_classifier, [X1, X2], y, cv=cv, method="predict_proba")

Later, I discovered that KerasClassifier only supports Sequential models: https://keras.io/scikit-learn-api/. In other words, it does not support functional API models with multiple inputs.

Therefore, I am wondering whether there is another way to perform cross-validation for models that use the functional API in Keras. More specifically, I want to get the predicted probability of each data instance when it falls in the test fold of the cross-validation, which is what cross_val_predict provides.

I am happy to provide more details if needed.

My current question is how to pass multiple inputs to StratifiedKFold.split(). I have marked the spot with ???????????? in the code below. Is it possible to pass them as [input1, input2, input3, input4, input5]?

Suppose I have 5 inputs: input1, input2, input3, input4 and input5. How can I use these inputs in StratifiedKFold.split()?

k_fold = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# enumerate() supplies the fold counter i used in the print below
for i, (train_index, test_index) in enumerate(k_fold.split(????????????, labels)):

    print("iteration", i, ":")
    print("train indices:", train_index)

    #input1
    print("train data:", input1[train_index])

    #input2
    print("train data:", input2[train_index])

    #input3
    print("train data:", input3[train_index])

    #input4
    print("train data:", input4[train_index])

    #input5
    print("train data:", input5[train_index])

    print("test indices:", test_index)
    # the same test indices apply to input2 ... input5
    print("test data:", input1[test_index])

Recommended answer

Interesting point that the scikit-learn wrapper only supports Sequential models. Looking at your model, though, I think you can use a single input, since both inputs share the same embedding:

def create_model():
    model = Sequential()
    model.add(Lambda(UniversalEmbedding, output_shape=(2, 512), input_shape=(2,)))
    # (2, 512)
    model.add(Flatten()) # (2*512)
    model.add(Dense(2*256, activation='relu')) # (2*256)
    model.add(Dense(2, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    return model

In other words, you have two inputs from the same domain that are embedded the same way, so you can use a single input of size 2. Then, to mimic the two Dense layers, you flatten and use a single Dense layer of twice the size :) This brings you to the concatenated layer, from which point the model is the same. (The wider layer is not strictly identical to two separate ones, because it can also mix the two embeddings, but it can represent them.)
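If the model has to keep its separate inputs, the loop that cross_val_predict runs can also be written by hand. The following is a minimal sketch under stated assumptions, not the method proposed in the answer above. The names `cross_val_predict_proba`, `build_model` and `CentroidModel` are hypothetical. `CentroidModel` is only a lightweight stand-in with a Keras-like `fit`/`predict` interface so that the example runs without TensorFlow; a compiled functional model such as the asker's `create_model` would presumably be passed in its place. The sketch relies on the documented behaviour that StratifiedKFold.split() needs only the labels to build the folds, so a placeholder array can stand in for X.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def cross_val_predict_proba(build_model, inputs, labels, n_splits=5, seed=0, **fit_kwargs):
    """Return out-of-fold class probabilities for a multi-input model.

    build_model: hypothetical zero-argument factory returning a fresh model with
                 fit(list_of_arrays, one_hot_labels, **kw) and predict(list_of_arrays).
    inputs:      list of arrays that all share the same first dimension.
    labels:      1-D array of integer class ids in 0..n_classes-1.
    """
    labels = np.asarray(labels)
    n_classes = int(labels.max()) + 1
    one_hot = np.eye(n_classes)[labels]          # targets for categorical_crossentropy
    oof = np.zeros((len(labels), n_classes))     # one probability row per sample

    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    placeholder = np.zeros(len(labels))          # split() only needs the sample count
    for train_idx, test_idx in cv.split(placeholder, labels):
        model = build_model()                    # fresh weights for every fold
        model.fit([x[train_idx] for x in inputs], one_hot[train_idx], **fit_kwargs)
        oof[test_idx] = model.predict([x[test_idx] for x in inputs])
    return oof


class CentroidModel:
    """Stand-in model (assumption): softmax over negative distances to class centroids."""

    def fit(self, xs, y_one_hot, **kwargs):
        x = np.concatenate(xs, axis=1)
        # Per-class mean of the concatenated features
        self.centroids = (y_one_hot.T @ x) / y_one_hot.sum(axis=0)[:, None]

    def predict(self, xs):
        x = np.concatenate(xs, axis=1)
        dist = ((x[:, None, :] - self.centroids[None]) ** 2).sum(axis=2)
        scores = np.exp(-(dist - dist.min(axis=1, keepdims=True)))
        return scores / scores.sum(axis=1, keepdims=True)


# Toy data with two inputs and two balanced classes
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X1 = rng.normal(size=(40, 3)) + y[:, None]
X2 = rng.normal(size=(40, 2)) - y[:, None]

proba = cross_val_predict_proba(CentroidModel, [X1, X2], y, n_splits=5)
print(proba.shape)
```

With a Keras model, the call would look like `cross_val_predict_proba(create_model, [X1, X2], y, n_splits=10, epochs=10, batch_size=10)`, where the extra keyword arguments are forwarded to `fit`. Building a new model inside the loop matters: reusing one model across folds would let weights trained on earlier folds leak into later predictions.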
