Custom loss function not improving with epochs

Problem description

I have created a custom loss function to deal with a binary class imbalance, but the loss does not improve from epoch to epoch. For metrics, I'm using precision and recall.

Is this a design issue where I'm not picking good hyper-parameters?

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras

# Try several class-weight vectors with the custom loss.
weights = [np.array([.10,.90]), np.array([.5,.5]), np.array([.1,.99]), np.array([.25,.75]), np.array([.35,.65])]
for weight in weights:
    print('Model with weights {a}'.format(a=weight))
    model = keras.models.Sequential([
        keras.layers.Flatten(), #input_shape=[X_train.shape[1]]
        keras.layers.Dense(32, activation='relu'),
        keras.layers.Dense(32, activation='relu'),
        keras.layers.Dense(1, activation='sigmoid')])
    model.compile(loss=weighted_loss(weight), metrics=[tf.keras.metrics.Precision(), tf.keras.metrics.Recall()])

    n_epochs = 10
    history = model.fit(X_train.astype('float32'), y_train.values.astype('float32'), epochs=n_epochs, validation_data=(X_test.astype('float32'), y_test.values.astype('float32')), batch_size=64)
    model.evaluate(X_test.astype('float32'), y_test.astype('float32'))
    pd.DataFrame(history.history).plot(figsize=(8, 5))
    plt.grid(True); plt.gca().set_ylim(0, 1); plt.show()

Custom loss function to deal with the class imbalance issue:

from tensorflow.keras import backend as K

def weighted_loss(weights):
    weights = K.variable(weights)
    def loss(y_true, y_pred):
        # Scale predictions so they sum to 1 along the last axis.
        y_pred /= K.sum(y_pred, axis=-1, keepdims=True)
        # Clip to avoid log(0).
        y_pred = K.clip(y_pred, K.epsilon(), 1 - K.epsilon())
        # Weighted cross-entropy, summed over the last axis.
        loss = y_true * K.log(y_pred) * weights
        loss = -K.sum(loss, -1)
        return loss
    return loss

Output:

Model with weights [0.1 0.9]
Epoch 1/10
274/274 [==============================] - 1s 2ms/step - loss: 1.1921e-08 - precision_24: 0.1092 - recall_24: 0.4119 - val_loss: 1.4074e-08 - val_precision_24: 0.1247 - val_recall_24: 0.3953
Epoch 2/10
274/274 [==============================] - 0s 1ms/step - loss: 1.1921e-08 - precision_24: 0.1092 - recall_24: 0.4119 - val_loss: 1.4074e-08 - val_precision_24: 0.1247 - val_recall_24: 0.3953
Epoch 3/10
274/274 [==============================] - 0s 1ms/step - loss: 1.1921e-08 - precision_24: 0.1092 - recall_24: 0.4119 - val_loss: 1.4074e-08 - val_precision_24: 0.1247 - val_recall_24: 0.3953
Epoch 4/10
274/274 [==============================] - 0s 969us/step - loss: 1.1921e-08 - precision_24: 0.1092 - recall_24: 0.4119 - val_loss: 1.4074e-08 - val_precision_24: 0.1247 - val_recall_24: 0.3953
[...]

[Image of the input dataset and the true y class labels: the input dataset is a (17480 x 20) matrix.]

y is the output array (2 classes) with dimensions (17480 x 1); the total number of 1's is 1748 (the class I want to predict).

Answer

Since there is no MWE present, it's rather difficult to be sure. To be as instructive as possible, I'll lay out some observations and remarks.

The first observation is that your custom loss function takes on very small values, ~1e-8, throughout training. This effectively tells your model that performance is already very good when, judging by the metrics you chose, it isn't. That points to a problem near the output, or in the loss function itself. Since you have a classification problem, my recommendation is to have a look at this post on weighted cross-entropy [1].
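
As a minimal sketch of what that could look like for a single sigmoid output (the pos_weight value is an assumption you would tune; a common heuristic is #negatives/#positives, here roughly (17480 - 1748) / 1748 ≈ 9):

import tensorflow as tf

def make_weighted_bce(pos_weight):
    # Weighted binary cross-entropy: pos_weight > 1 penalizes
    # missed positives more heavily than false positives.
    def loss(y_true, y_pred):
        # Clip to avoid log(0); y_pred is already a probability from the sigmoid.
        y_pred = tf.clip_by_value(y_pred, 1e-7, 1 - 1e-7)
        bce = -(pos_weight * y_true * tf.math.log(y_pred)
                + (1.0 - y_true) * tf.math.log(1.0 - y_pred))
        return tf.reduce_mean(bce)
    return loss

# e.g. model.compile(optimizer='adam', loss=make_weighted_bce(pos_weight=9.0), ...)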

The second observation is that you don't seem to have a benchmark for your model's performance. In general, an ML workflow goes from very simple to complex models. I would recommend trying a simple logistic regression [2] to get an idea of minimal performance. After that I would try more complex models such as gradient-boosted trees (XGBoost/LightGBM/...) or a random forest, especially since you are using a full-blown neural network on tabular data with only about 20 numerical features, which tends to still be traditional machine learning territory.
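
A hedged sketch of such a baseline, assuming X_train/y_train/X_test/y_test as in the question (class_weight='balanced' is one simple way to account for the imbalance):

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score

# class_weight='balanced' reweights classes inversely to their frequency.
clf = LogisticRegression(class_weight='balanced', max_iter=1000)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
print('precision:', precision_score(y_test, y_pred))
print('recall:   ', recall_score(y_test, y_pred))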

Once you have obtained a baseline, and perhaps improved performance using a standard machine learning technique, you can look towards a neural network again. Some other recommendations, depending on the results of the traditional approaches:

  • Try several optimizers and cross-validate them over different learning rates.

  • Try, as @TyQuangTu mentioned, some simpler and shallower architectures.

  • Try an activation function that does not suffer from the "dying neuron" problem, such as LeakyReLU or ELU (see the sketch after this list).
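
A minimal sketch combining the optimizer and activation suggestions (the learning rate and alpha values are illustrative assumptions, not recommendations):

from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Flatten(),
    keras.layers.Dense(32),
    keras.layers.LeakyReLU(alpha=0.1),         # no zero-gradient region for x < 0
    keras.layers.Dense(32, activation='elu'),  # ELU is another option
    keras.layers.Dense(1, activation='sigmoid')])

# Cross-validate optimizers and learning rates, e.g.:
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss='binary_crossentropy',
              metrics=[keras.metrics.Precision(), keras.metrics.Recall()])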

Hopefully this answer helps you, and if you have any more questions I am glad to help.

[1] Unbalanced data and weighted cross entropy

[2] https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
