Early stopping with Keras and sklearn GridSearchCV cross-validation


Problem description


I wish to implement early stopping with Keras and sklearn's GridSearchCV.

The working code example below is modified from How to Grid Search Hyperparameters for Deep Learning Models in Python With Keras. The data set may be downloaded from here.

The modification adds the Keras EarlyStopping callback class to prevent over-fitting. For this to be effective, it requires the monitor='val_acc' argument to monitor validation accuracy. For val_acc to be available, KerasClassifier requires validation_split=0.1 to generate validation accuracy; otherwise EarlyStopping raises RuntimeWarning: Early stopping requires val_acc available!. Note the FIXME: code comment!

Note that we could replace val_acc with val_loss.

Question: How can I use the cross-validation data set generated by the GridSearchCV k-fold algorithm instead of wasting 10% of the training data for an early stopping validation set?

# Use scikit-learn to grid search the learning rate and momentum
import numpy
from sklearn.model_selection import GridSearchCV
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasClassifier
from keras.optimizers import SGD

# Function to create model, required for KerasClassifier
def create_model(learn_rate=0.01, momentum=0):
    # create model
    model = Sequential()
    model.add(Dense(12, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    # Compile model
    optimizer = SGD(lr=learn_rate, momentum=momentum)
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model

# Early stopping
from keras.callbacks import EarlyStopping
stopper = EarlyStopping(monitor='val_acc', patience=3, verbose=1)

# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataset = numpy.loadtxt("pima-indians-diabetes.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]
# create model
model = KerasClassifier(
    build_fn=create_model,
    epochs=100, batch_size=10,
    validation_split=0.1, # FIXME: Instead use GridSearchCV k-fold validation data.
    verbose=2)
# define the grid search parameters
learn_rate = [0.01, 0.1]
momentum = [0.2, 0.4]
param_grid = dict(learn_rate=learn_rate, momentum=momentum)
grid = GridSearchCV(estimator=model, param_grid=param_grid, verbose=2, n_jobs=1)

# Fitting parameters
fit_params = dict(callbacks=[stopper])
# Grid search.
grid_result = grid.fit(X, Y, **fit_params)

# summarize results
print("Best: %f using %s" % (grid_result.best_score_, grid_result.best_params_))
means = grid_result.cv_results_['mean_test_score']
stds = grid_result.cv_results_['std_test_score']
params = grid_result.cv_results_['params']
for mean, stdev, param in zip(means, stds, params):
    print("%f (%f) with: %r" % (mean, stdev, param))

Solution

[Answer after the question was edited & clarified:]

Before rushing into implementation issues, it is always good practice to take some time to think about the methodology and the task itself; arguably, intermingling early stopping with the cross-validation procedure is not a good idea.

Let's make up an example to highlight the argument.

Suppose that you indeed use early stopping with 100 epochs, and 5-fold cross validation (CV) for hyperparameter selection. Suppose also that you end up with a hyperparameter set X giving best performance, say 89.3% binary classification accuracy.

Now suppose that your second-best hyperparameter set, Y, gives 89.2% accuracy. Examining the individual CV folds closely, you see that, for your best case X, 3 out of the 5 CV folds exhausted the maximum 100 epochs, while in the other 2 early stopping kicked in, at 95 and 93 epochs respectively.

Now imagine that, examining your second-best set Y, you again see that 3 out of the 5 CV folds exhausted the 100 epochs, while the other 2 both stopped fairly early, at around 80 epochs.

What would be your conclusion from such an experiment?

Arguably, you would find yourself in an inconclusive situation; further experiments might reveal which is actually the best hyperparameter set, provided of course that you had thought to look into these details of the results in the first place. And needless to say, if all this were automated through a callback, you might have missed your best model despite having actually tried it.


The whole CV idea is implicitly based on the "all other things being equal" argument (which of course is never true in practice, and can only be approximated in the best possible way). If you feel that the number of epochs should be a hyperparameter, just include it explicitly in your CV as such, rather than inserting it through the back door of early stopping and thus possibly compromising the whole process (not to mention that early stopping has a hyperparameter of its own, patience).
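As an illustration only, here is a minimal sketch of what that could look like, reusing create_model, X, and Y from the question's code (the epoch values in the grid are arbitrary examples, not recommendations):

# Sketch: make the number of epochs an explicit hyperparameter of the grid,
# rather than letting early stopping choose it implicitly per fold.
model = KerasClassifier(build_fn=create_model, batch_size=10, verbose=2)
param_grid = dict(
    learn_rate=[0.01, 0.1],
    momentum=[0.2, 0.4],
    epochs=[50, 100, 150])  # arbitrary example values
grid = GridSearchCV(estimator=model, param_grid=param_grid, verbose=2, n_jobs=1)
grid_result = grid.fit(X, Y)  # no EarlyStopping callback, no validation_split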

Not intermingling these two techniques doesn't mean, of course, that you cannot use them sequentially: once you have obtained your best hyperparameters through CV, you can always employ early stopping when fitting the model on your whole training set (provided, of course, that you do have a separate validation set).
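As a rough sketch of that sequential workflow, again reusing names from the question's code (the 80/20 hold-out split and the switch to monitor='val_loss' are illustrative assumptions, not prescriptions):

from sklearn.model_selection import train_test_split
from keras.callbacks import EarlyStopping

# Step 1: select hyperparameters with plain CV, no early stopping involved.
grid_result = grid.fit(X, Y)
best = grid_result.best_params_

# Step 2: hold out a separate validation set, then fit the final model with
# early stopping monitoring that held-out set.
X_train, X_val, Y_train, Y_val = train_test_split(
    X, Y, test_size=0.2, random_state=seed)  # 80/20 split is an arbitrary choice
final_model = create_model(learn_rate=best['learn_rate'], momentum=best['momentum'])
final_model.fit(X_train, Y_train,
                epochs=100, batch_size=10,
                validation_data=(X_val, Y_val),
                callbacks=[EarlyStopping(monitor='val_loss', patience=3, verbose=1)])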


The field of deep neural nets is still (very) young, and it is true that it has yet to establish its "best practice" guidelines; add the fact that, thanks to an amazing community, there are all sorts of tools available in open-source implementations, and you can easily find yourself in the (admittedly tempting) position of mixing things up just because they happen to be available. I am not necessarily saying that this is what you are attempting to do here - I am just urging more caution when combining ideas that may not have been designed to work together...
