Which loss function to use in Keras Sequential Model


Problem description

I am using a Keras sequential model. The prediction output has shape (1, 5) (5 features).

I have an accuracy metric defined as follows:

For N predictions, the accuracy of the model is the percentage of predicted samples such that, for each prediction and its respective true labels, every feature differs by no more than 10.

For example, if y_i = [1, 2, 3, 4, 5] and ypred_i = [1, 2, 3, 4, 16], it is not a match, since the last feature differs by 11. If y_i = [1, 2, 3, 4, 5] and ypred_i = [10, 8, 0, 5, 7], it is a match, because every feature differs from its respective true feature by no more than 10.
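
To make the rule concrete, here is a minimal NumPy sketch of this accuracy (the tolerance of 10 comes from the description above; the helper name tolerance_accuracy is made up for illustration):

import numpy as np

def tolerance_accuracy(y_true, y_pred, tol=10):
    # a sample counts as a match only if every feature is within tol
    matches = np.all(np.abs(y_true - y_pred) <= tol, axis=1)
    return matches.mean()  # fraction of matching samples

y = np.array([[1, 2, 3, 4, 5]])
print(tolerance_accuracy(y, np.array([[1, 2, 3, 4, 16]])))  # 0.0 (last feature differs by 11)
print(tolerance_accuracy(y, np.array([[10, 8, 0, 5, 7]])))  # 1.0 (all features within 10)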

I am wondering which loss function to use in my Keras sequential model so as to maximize the accuracy described above. Should I define a custom loss function, what should it look like, and how should I proceed?

My code is:

from keras.models import Sequential  # or: from tensorflow.keras.models import Sequential
from keras.layers import Dense       # or: from tensorflow.keras.layers import Dense

class NeuralNetMulti(Regressor):  # Regressor is the user's own base class
    def __init__(self):
        self.name = 'keras-sequential'
        self.model = Sequential()
        # self.earlystopping = callbacks.EarlyStopping(monitor="mae",
        #                                              mode="min", patience=5,
        #                                              restore_best_weights=True)

    def fit(self, X, y):
        print('Fitting into the neural net...')
        n_inputs = X.shape[1]
        n_outputs = y.shape[1]
        self.model.add(Dense(400, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))
        # self.model.add(Dense(20, activation='relu'))
        self.model.add(Dense(200, activation='relu'))
        # self.model.add(Dense(10, activation='relu'))
        self.model.add(Dense(n_outputs))
        self.model.summary()
        self.model.compile(loss='mae', optimizer='adam', metrics=['mse', 'mae', 'accuracy'])
        history = self.model.fit(X, y, verbose=1, epochs=200, validation_split=0.1)
        # self.model.fit(X, y, verbose=1, epochs=1000, callbacks=[self.earlystopping])
        print('Fitting completed!')

    def predict(self, X):
        print('Predicting...')
        predictions = self.model.predict(X, verbose=1)
        print('Predicted!')
        return predictions
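
For context, a hypothetical usage of this class might look as follows (X_train, y_train, and X_test are placeholders for the user's data; y_train would have shape (n_samples, 5)):

nn = NeuralNetMulti()
nn.fit(X_train, y_train)          # builds the network and trains it
predictions = nn.predict(X_test)  # array of shape (n_test, 5)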

My suggestion for a loss function:

def N_distance(y_true, y_pred):
    vals = abs(y_true - y_pred)
    if all(a <= 10 for a in vals):
        return 0
    return 1

It returns:

  • 0 if the condition holds
  • 1 otherwise.

Solution

First of all, your loss needs to be differentiable so that it is possible to compute its gradient with respect to the weights. That gradient can then be used to optimize the weights, which is the whole point of gradient-based optimization algorithms like gradient descent. If you write your own loss, this is the first thing you need to keep in mind, and it is why your loss does not work: you need to rethink your loss or the whole problem.
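
As an illustration of what a differentiable alternative could look like (this sketch is not part of the original answer; the name soft_N_distance and the linear penalty are assumptions), one option is to penalize only the amount by which each feature exceeds the tolerance:

import tensorflow as tf

def soft_N_distance(y_true, y_pred, tol=10.0):
    # zero penalty inside the tolerance band, linear penalty outside it;
    # differentiable almost everywhere thanks to tf.abs and tf.nn.relu
    excess = tf.nn.relu(tf.abs(y_true - y_pred) - tol)
    return tf.reduce_mean(excess)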

Next, do not forget that you need to use Keras or TensorFlow functions in your loss, so that the functions used have a defined gradient and the chain rule can be applied. Using plain Python abs() is not a good idea. This question might point you in the right direction: https://ai.stackexchange.com/questions/26426/why-is-tf-abs-non-differentiable-in-tensorflow.
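
As a quick sketch of the same check written with standard TensorFlow ops instead of Python built-ins (the tensors here are made-up examples):

import tensorflow as tf

y_true = tf.constant([[1., 2., 3., 4., 5.]])
y_pred = tf.constant([[1., 2., 3., 4., 16.]])

diff = tf.abs(y_true - y_pred)            # tf.abs instead of Python abs()
ok = tf.reduce_all(diff <= 10.0, axis=1)  # tf.reduce_all instead of Python all()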

Furthermore, from your question and comments I see the expected output should be between 0 and 100. In that case, I would try to scale the outputs and the labels of the network so that they always lie in that range. There are multiple ways to go about it: divide your labels by 100 and use a sigmoid activation on the outputs, or check e.g. this answer: How to restrict output of a neural net to a specific range?

Then you can start thinking about how to write your loss. From your description it is not clear what should happen in this case: y_i = [1, 2, 3, 4, 100] and pred = [1, 2, 3, 4, 110]. Is the value 110 still acceptable, even though it should not be possible in theory?

In any case, you can just use mae or mse as the loss. Your network will try to fit the targets exactly, and you can then use your special non-differentiable function purely as a metric to measure how well the network is trained according to your rules.

A concrete example:

  • The last layer of your network needs an activation specified, like so: self.model.add(Dense(n_outputs, activation='sigmoid')), which scales all network outputs to the interval from 0 to 1.
  • Since your labels are defined on the interval from 0 to 100, divide them by 100 (y /= 100) so that they also lie in the interval from 0 to 1 before using them in the network.
  • Then you can use mae or mse as the loss and your special function just as a metric: self.model.compile(loss='mae', optimizer='adam', metrics=[custom_metric]) (a combined sketch follows this list).
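
Putting these three points together, the compile-and-fit part of the question's fit method might look like this (a sketch under the assumptions above, not the author's exact code; custom_metric is the function shown next):

y = y / 100.0  # scale labels from [0, 100] into [0, 1]
self.model.add(Dense(400, input_dim=n_inputs, kernel_initializer='he_uniform', activation='relu'))
self.model.add(Dense(200, activation='relu'))
self.model.add(Dense(n_outputs, activation='sigmoid'))  # outputs constrained to [0, 1]
self.model.compile(loss='mae', optimizer='adam', metrics=[custom_metric])
history = self.model.fit(X, y, verbose=1, epochs=200, validation_split=0.1)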

The custom_metric function might look like this:

import tensorflow as tf

def custom_metric(y_true, y_pred):
    # labels and outputs are scaled to [0, 1], so a tolerance of 10
    # on the original 0-100 scale becomes 0.1 here
    valid_distance = 0.1
    valid = tf.abs(y_true - y_pred) <= valid_distance
    # fraction of samples whose features are all within the tolerance
    return tf.reduce_mean(tf.cast(tf.reduce_all(valid, axis=1), tf.float32))
