Regularization strategy in Keras


Problem description

I am trying to set up a non-linear regression problem in Keras. Unfortunately, the results show that overfitting is occurring. Here is the code:

from keras.models import Sequential
from keras.layers import Dense
from keras import optimizers, regularizers

# Five hidden ReLU layers; note that l2(0) applies no penalty at all
model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0)))
model.add(Dense(outdim, activation='linear'))  # linear output for regression
adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2, initial_epoch=0)

The results without regularization are shown here (Without regularization). The mean absolute error for training is much lower than for validation, and the two keep a fixed gap, which is a sign of overfitting.

L2 regularization

# Same architecture, now with an L2 penalty of 0.001 on every hidden layer
model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(int(number_of_neurons), activation='relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(Dense(outdim, activation='linear'))
adam = optimizers.Adam(lr=0.001)
model.compile(loss='mean_squared_error', optimizer=adam, metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2, initial_epoch=0)

The results for this run are shown here (L2 regularized result). The test MAE is now close to the training MAE, which is good. However, the training MAE itself is poor at 0.03 (without regularization it was much lower at 0.0028).

What can I do to reduce the training MAE with regularization?

Recommended answer

Based on your results, it looks like you need to find the right amount of regularization to balance training accuracy with good generalization to the test set. This may be as simple as reducing the L2 parameter. Try reducing lambda from 0.001 to 0.0001 and comparing your results.
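
For instance, here is a minimal sketch of that change, reusing the names from the question (number_of_neurons, outdim, X_train, X, Y); the loop is just shorthand for the five hidden layers above, and 0.0001 is only the suggested starting point:

# Sketch: the question's model with the L2 weight as a single tunable
# constant, reduced from 0.001 to 0.0001 as suggested above.
l2_weight = 0.0001
model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu',
                kernel_regularizer=regularizers.l2(l2_weight)))
for _ in range(4):  # the four remaining hidden layers
    model.add(Dense(int(number_of_neurons), activation='relu',
                    kernel_regularizer=regularizers.l2(l2_weight)))
model.add(Dense(outdim, activation='linear'))
model.compile(loss='mean_squared_error', optimizer=optimizers.Adam(lr=0.001), metrics=['mae'])
model.fit(X, Y, epochs=1000, batch_size=500, validation_split=0.2, shuffle=True, verbose=2)

If 0.0001 still over-regularizes, intermediate values (e.g. 0.0005) can be scanned the same way, comparing the gap between training and validation MAE each time.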

If you can't find a good parameter setting for L2, you could try dropout regularization instead. Just add model.add(Dropout(0.2)) between each pair of dense layers, and experiment with the dropout rate if necessary. A higher dropout rate corresponds to more regularization.
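
A minimal sketch of that dropout variant, again reusing the question's names; the 0.2 rate is only a starting point to experiment with:

from keras.layers import Dropout  # in addition to the imports above

model = Sequential()
model.add(Dense(number_of_neurons, input_dim=X_train.shape[1], activation='relu'))
model.add(Dropout(0.2))  # randomly zeroes 20% of activations during training
for _ in range(4):
    model.add(Dense(int(number_of_neurons), activation='relu'))
    model.add(Dropout(0.2))
model.add(Dense(outdim, activation='linear'))  # no dropout after the output layer
model.compile(loss='mean_squared_error', optimizer=optimizers.Adam(lr=0.001), metrics=['mae'])

Note that dropout is only active during training; at evaluation time Keras uses the full network automatically, so the training and validation MAE curves remain directly comparable.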
