Handling missing data for the main loss, which is present for auxiliary loss

Problem description

I want to construct a Keras model for a dataset with a main target and an auxiliary target. I have data for the auxiliary target for all entries in my dataset, but for the main target I have data only for a subset of all data points. Consider the following example, which is supposed to predict

max(min(x1, x2), x3)

but for some values only my auxiliary target, min(x1, x2), is given.

from keras.models import Model
from keras.optimizers import Adadelta
from keras.losses import mean_squared_error
from keras.layers import Input, Dense

import tensorflow as tf
import numpy

input = Input(shape=(3,))

hidden = Dense(2)(input)
min_pred = Dense(1)(hidden)
max_min_pred = Dense(1)(hidden)

model = Model(inputs=[input],
              outputs=[min_pred, max_min_pred])

model.compile(
    optimizer=Adadelta(),
    loss=mean_squared_error,
    loss_weights=[0.2, 1.0])

def random_values(n, missing=False):
    for i in range(n):
        x = numpy.random.random(size=(4, 3))
        _min = numpy.minimum(x[..., 0], x[..., 1])
        if missing:
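            # no data for the main target: emit NaN placeholders so the batch arrays stay homogeneous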
            _max_min = numpy.full((len(x), 1), numpy.nan)
        else:
            _max_min = numpy.maximum(_min, x[..., 2]).reshape((-1, 1))
        yield x, [numpy.array(_min).reshape((-1, 1)), numpy.array(_max_min)]

model.fit_generator(random_values(50, False),
                    steps_per_epoch=50)
model.fit_generator(random_values(5, True),
                    steps_per_epoch=5)
model.fit_generator(random_values(50, False),
                    steps_per_epoch=50)

Obviously, the code above does not work: a target of NaN means a loss of NaN, which means a weight update of NaN, so the weights go to NaN and the model becomes useless. (Also, instantiating the entire NaN array is wasteful, but in principle my missing data can be part of any batch that also contains present data, so for the sake of having homogeneous arrays it seems reasonable.)
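
To see the failure mode in isolation: a single NaN among the targets already turns the batch's mean squared error, and with it every gradient derived from it, into NaN. A minimal numpy illustration of just that:

import numpy

y_true = numpy.array([[0.7], [numpy.nan]])  # one entry with a missing main target
y_pred = numpy.array([[0.6], [0.5]])
print(numpy.mean((y_true - y_pred) ** 2))   # nan -- the whole batch loss is poisoned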

My code does not have to work with all Keras backends; TensorFlow-only code is fine. I have tried changing the loss function,

def loss_0_where_nan(loss_function):
    def filtered_loss_function(y_true, y_pred):
        with_nans = loss_function(y_true, y_pred)
        nans = tf.is_nan(with_nans)
        return tf.where(nans, tf.zeros_like(with_nans), with_nans)
    return filtered_loss_function

and using loss_0_where_nan(mean_squared_error) as the new loss function, but it still introduces NaNs.
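
A likely reason the filtered loss still produces NaNs: the NaN has already entered the graph inside loss_function(y_true, y_pred), and tf.where only cleans up the forward values. In the backward pass the chain rule multiplies the zeroed upstream gradient by the NaN local gradient of the squared error, and 0 * NaN is still NaN. A minimal sketch of a loss that keeps NaN out of the graph altogether by sanitising y_true first (mse_ignore_nan is an illustrative name, not a Keras function, and the recommended answer below takes a different route):

def mse_ignore_nan(y_true, y_pred):
    # Treat NaN targets as "no supervision": replace them with the prediction
    # itself, so those entries contribute exactly zero error and no NaN ever
    # enters the graph, neither in the forward nor in the backward pass.
    present = tf.logical_not(tf.is_nan(y_true))
    y_true_clean = tf.where(present, y_true, y_pred)
    # Per-sample MSE, mirroring keras.losses.mean_squared_error; samples whose
    # target is missing simply add zero to the batch average.
    return tf.reduce_mean(tf.square(y_true_clean - y_pred), axis=-1)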

How should I handle missing target data for the main prediction output where I have auxiliary target data? Will masking help?

Recommended answer

In your question, you present the case where missing data comes in predictable chunks in your dataset. If you can separate out missing data and existing data like that, you can use

truncated_model = Model(inputs=[input],
                        outputs=[min_pred])

truncated_model.compile(
    optimizer=Adadelta(),
    loss=[mean_squared_error])

to define a model that shares some layers with your complete model, and then replace

model.fit_generator(random_values(5, True),
                    steps_per_epoch=5)

with

def partial_data(entry):
    x, (y0, y1) = entry
    return x, y0

truncated_model.fit_generator(map(partial_data, random_values(5, True)),
                              steps_per_epoch=5)

to train the truncated model on the non-missing data.

Given this level of control over your input data providers, you can obviously adapt your random_values method such that it does not even generate the data that partial_data immediately throws away again, but I thought this would be the clearer way to present the necessary changes.
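
For instance, a minimal sketch of such an adapted generator (random_values_aux_only is a made-up name; it mirrors random_values but yields only the auxiliary target):

def random_values_aux_only(n):
    # Like random_values(n, missing=True), but yields only (x, y_min), which is
    # exactly what truncated_model expects, so no NaN placeholder array is ever
    # built and then thrown away.
    for i in range(n):
        x = numpy.random.random(size=(4, 3))
        _min = numpy.minimum(x[..., 0], x[..., 1]).reshape((-1, 1))
        yield x, _min

truncated_model.fit_generator(random_values_aux_only(5),
                              steps_per_epoch=5)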
