When is it appropriate to use sample_weights in keras?


Question


According to this question, I learnt that class_weight in keras applies a weighted loss during training, and that sample_weight does something sample-wise if I don't have equal confidence in all the training samples.

So my questions are:

  1. Is the loss during validation weighted by the class_weight, or is it only weighted during training?
  2. My dataset has 2 classes, and I don't actually have a seriously imbalanced class distribution. The ratio is approx. 1.7 : 1. Is it necessary to use class_weight to balance the loss, or even to use oversampling? Is it OK to leave the slightly imbalanced data as-is and treat it like a usual dataset?
  3. Can I simply consider sample_weight as the weight I give to each training sample? My training samples can be treated with equal confidence, so I probably don't need to use this.

Answer

  1. The Keras documentation says:


class_weight: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class.


So class_weight only affects the loss during training. I have myself been interested in understanding how the class and sample weights are handled during testing and training. Looking at the Keras GitHub repo and the code for metrics and losses, it does not seem that either is affected by them. The printed values are quite hard to track through the training code, such as model.fit() and its corresponding TensorFlow backend training functions. So I decided to write a test covering the possible scenarios; see the code below. The conclusion is that both class_weight and sample_weight only affect the training loss, with no effect on any metric or on the validation loss. A little surprising, as val_sample_weights (which you can specify) seems to do nothing (??).
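For intuition, here is a tiny hypothetical sketch (my own helper, not Keras internals) of what a class_weight dict effectively amounts to: one weight per training sample, looked up from that sample's label.

import numpy as np

# Hypothetical helper, not Keras internals: expand a class_weight
# dict into one weight per sample by looking up each sample's label.
def class_weight_to_sample_weight(y, class_weight):
    return np.array([class_weight[label] for label in y])

y = np.array([0, 1, 1, 0])
print(class_weight_to_sample_weight(y, {0: 1.0, 1: 2.0}))  # [1. 2. 2. 1.]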


  2. This type of question always depends on your problem: how skewed the data is, and in what way you are trying to optimize the model. If you are optimizing for accuracy, then as long as the training data is as skewed as the data the model will see in production, the best result will be achieved by simply training without any over/under-sampling and/or class weights. If, on the other hand, one class is more important (or expensive) than another, then you should weight the data. For example, in fraud prevention, fraud is normally much more expensive than the income from non-fraud. I would suggest you try unweighted classes, weighted classes, and some under/over-sampling, and check which gives the best validation results. Use a validation function (or write your own) that best compares the different models (for example, weighting true positives, false positives, true negatives, and false negatives differently depending on cost). A relatively new loss function that has shown great results in Kaggle competitions on skewed data is focal loss. Focal loss reduces the need for over/under-sampling. Unfortunately, focal loss is not a built-in function in Keras (yet), but it can be programmed manually, as sketched below.
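A minimal hand-written sketch of a binary focal loss against tf.keras, for reference; the alpha and gamma defaults follow the focal loss paper, but treat this as an illustration rather than a vetted implementation.

import tensorflow as tf

def focal_loss(gamma=2.0, alpha=0.25):
    # Binary focal loss (Lin et al., 2017): down-weights easy,
    # well-classified examples so training focuses on hard ones.
    def loss_fn(y_true, y_pred):
        y_true = tf.cast(y_true, y_pred.dtype)
        eps = tf.keras.backend.epsilon()
        y_pred = tf.clip_by_value(y_pred, eps, 1.0 - eps)
        # p_t is the predicted probability assigned to the true class.
        p_t = y_true * y_pred + (1.0 - y_true) * (1.0 - y_pred)
        alpha_t = y_true * alpha + (1.0 - y_true) * (1.0 - alpha)
        return -tf.reduce_mean(alpha_t * tf.pow(1.0 - p_t, gamma) * tf.math.log(p_t))
    return loss_fn

# Usage (hypothetical model): model.compile(loss=focal_loss(), optimizer='adam')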


  3. Yes, I think you are correct. I normally use sample_weight for two reasons: (1) the training data has some kind of measurement uncertainty, which, if known, can be used to weight accurate data more heavily than inaccurate measurements; or (2) we can weight newer data more than old, forcing the model to adapt to new behavior more quickly without ignoring valuable old data. A sketch of the second case follows below.
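As an illustration of the recency idea (the 30-day half-life here is an arbitrary assumption, not anything from Keras), sample weights can decay exponentially with sample age:

import numpy as np

# Hypothetical example: weight newer samples more than old ones via
# exponential decay; the half-life value is an arbitrary choice.
def recency_weights(age_days, half_life=30.0):
    return 0.5 ** (np.asarray(age_days, dtype=float) / half_life)

age_days = np.array([0, 15, 30, 90])
print(recency_weights(age_days))  # [1.     0.7071 0.5    0.125 ]
# model.fit(x_train, y_train, sample_weight=recency_weights(age_days))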


The code for comparing training with and without class_weights and sample_weights, while holding the model and everything else static:

import tensorflow as tf
import numpy as np

data_size = 100
input_size = 3
classes = 3

x_train = np.random.rand(data_size, input_size)
y_train = np.random.randint(0, classes, data_size)
#sample_weight_train = np.random.rand(data_size)
x_val = np.random.rand(data_size, input_size)
y_val = np.random.randint(0, classes, data_size)
#sample_weight_val = np.random.rand(data_size)

inputs = tf.keras.layers.Input(shape=(input_size,))
pred = tf.keras.layers.Dense(classes, activation='softmax')(inputs)

model = tf.keras.models.Model(inputs=inputs, outputs=pred)

loss = tf.keras.losses.sparse_categorical_crossentropy
metrics = tf.keras.metrics.sparse_categorical_accuracy

model.compile(loss=loss, metrics=[metrics], optimizer='adam')

# Freeze all layers so the model stays static and the runs below are comparable
for layer in model.layers:
    layer.trainable = False

# base model, no weights (same result as without class_weights)
# model.fit(x=x_train, y=y_train, validation_data=(x_val, y_val))
class_weights = {0: 1., 1: 1., 2: 1.}
model.fit(x=x_train, y=y_train, class_weight=class_weights, validation_data=(x_val, y_val))
# which outputs:
> loss: 1.1882 - sparse_categorical_accuracy: 0.3300 - val_loss: 1.1965 - val_sparse_categorical_accuracy: 0.3100

# changing the class weights to zero, to check which losses and metrics are affected
class_weights = {0: 0, 1: 0, 2: 0}
model.fit(x=x_train, y=y_train, class_weight=class_weights, validation_data=(x_val, y_val))
# which outputs:
> loss: 0.0000e+00 - sparse_categorical_accuracy: 0.3300 - val_loss: 1.1945 - val_sparse_categorical_accuracy: 0.3100

# changing the sample weights to zero, to check which losses and metrics are affected
sample_weight_train = np.zeros(100)
sample_weight_val = np.zeros(100)
model.fit(x=x_train, y=y_train, sample_weight=sample_weight_train,
          validation_data=(x_val, y_val, sample_weight_val))
# which outputs:
> loss: 0.0000e+00 - sparse_categorical_accuracy: 0.3300 - val_loss: 1.1931 - val_sparse_categorical_accuracy: 0.3100


There are some small deviations between using weights and not (even when all weights are one), possibly because fit uses different backend functions for weighted and unweighted data, or due to rounding errors?

