Sklearn cross_val_score with multi input KerasClassifier

Views: 165 | Published: 2021/2/14 20:39:44 | Tags: tensorflow, scikit-learn, keras

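Problem description

The goal is to perform cross-validation on a Keras model with multiple inputs. This works fine with a normal Sequential model that has only one input. However, when the model is built with the functional API and extended to two inputs, sklearn's cross_val_score does not work as expected.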

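# Assumed imports; the question does not show them.
import tensorflow as tf
from keras.layers import Input, Lambda, Dense, Concatenate
from keras.models import Model

# UniversalEmbedding is the asker's own embedding function and is not shown in the question.
def create_model():
    # Branch 1: body text -> 512-d embedding -> dense layer
    input_text = Input(shape=(1,), dtype=tf.string)
    embedding = Lambda(UniversalEmbedding, output_shape=(512, ))(input_text)
    dense = Dense(256, activation='relu')(embedding)

    # Branch 2: title -> 512-d embedding -> dense layer
    input_title = Input(shape=(1,), dtype=tf.string)
    embedding_title = Lambda(UniversalEmbedding, output_shape=(512, ))(input_title)
    dense_title = Dense(256, activation='relu')(embedding_title)

    # Merge both branches and classify into two classes
    out = Concatenate()([dense, dense_title])

    pred = Dense(2, activation='softmax')(out)
    model = Model(inputs=[input_text, input_title], outputs=pred)
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    return model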

The part that fails

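# Assumed imports; the question does not show them.
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

keras_classifier = KerasClassifier(build_fn=create_model, epochs=10, batch_size=10, verbose=1)
cv = StratifiedKFold(n_splits=10, random_state=0)
results = cross_val_score(keras_classifier, [X1, X2], y, cv=cv, scoring='f1_weighted')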

Error

Traceback (most recent call last):
  File "func.py", line 73, in <module>
    results = cross_val_score(keras_classifier, [X1, X2], y, cv=cv, scoring='f1_weighted')
  File "/home/timisb/.local/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 402, in cross_val_score
    error_score=error_score)
  File "/home/timisb/.local/lib/python3.6/site-packages/sklearn/model_selection/_validation.py", line 225, in cross_validate
    X, y, groups = indexable(X, y, groups)
  File "/home/timisb/.local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 260, in indexable
    check_consistent_length(*result)
  File "/home/timisb/.local/lib/python3.6/site-packages/sklearn/utils/validation.py", line 235, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [2, 643]

Does anyone have an alternative approach to this, or a suggestion for a solution? Thanks!

Solution

I found the reason, given below.

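According to the Keras documentation, you can use Sequential Keras models (single-input only) as part of your Scikit-Learn workflow via the wrappers found at keras.wrappers.scikit_learn.py. A two-input functional model is outside that scope. The failure also happens before the wrapper is reached: scikit-learn treats the list [X1, X2] as a dataset of 2 samples and compares it with the 643 labels in y, which produces the [2, 643] mismatch in the traceback.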
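The length mismatch can be reproduced without Keras or scikit-learn. The sketch below uses made-up placeholder lists of 643 items each (the real X1, X2 and y are not shown in the question) and only illustrates which lengths end up being compared.

```python
# Hypothetical stand-in data: 643 samples per input, as in the traceback.
X1 = ["text %d" % i for i in range(643)]
X2 = ["title %d" % i for i in range(643)]
y = [i % 2 for i in range(643)]

# This is the object that was passed as X to cross_val_score.
X = [X1, X2]

# The outer list has one entry per model input, not one per sample,
# so its length is 2 while y has 643 entries.
lengths = [len(X), len(y)]
print(lengths)
```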

https://keras.io/scikit-learn-api/
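One possible alternative, not part of the original answer, is to skip cross_val_score and run the folds by hand, slicing every input array with the same indices. The following is a minimal sketch under several assumptions: the helper name cross_val_multi_input is hypothetical, each input is a NumPy-compatible array with one row per sample, the labels in y are integers 0..k-1, and the model returned by build_model exposes Keras-style fit and predict methods that accept a list of arrays.

```python
import numpy as np
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold


def cross_val_multi_input(build_model, inputs, y, n_splits=10, seed=0, **fit_kwargs):
    """Return one weighted F1 score per fold for a model that takes a list of inputs.

    build_model: zero-argument callable returning a fresh, compiled model
    inputs: list of arrays, each with one row per sample
    y: 1-D array of integer class labels (assumed to be 0..k-1)
    fit_kwargs: forwarded to model.fit (for example epochs, batch_size)
    """
    inputs = [np.asarray(x) for x in inputs]
    y = np.asarray(y)
    n_classes = len(np.unique(y))
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    # Stratification only needs the labels, so a dummy X of the right length is enough.
    for train_idx, test_idx in cv.split(np.zeros(len(y)), y):
        model = build_model()  # new model per fold so no weights leak between folds
        train_inputs = [x[train_idx] for x in inputs]
        test_inputs = [x[test_idx] for x in inputs]
        # One-hot targets, matching a softmax output trained with categorical_crossentropy.
        y_train = np.eye(n_classes)[y[train_idx]]
        model.fit(train_inputs, y_train, **fit_kwargs)
        # Convert predicted class probabilities to class labels.
        y_pred = np.argmax(model.predict(test_inputs), axis=1)
        scores.append(f1_score(y[test_idx], y_pred, average="weighted"))
    return np.array(scores)
```

With the names from the question, a call might look like `cross_val_multi_input(create_model, [X1, X2], y, n_splits=10, epochs=10, batch_size=10, verbose=1)`. Shuffling is enabled here so that the seed has an effect; this differs from the StratifiedKFold settings in the question.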
