Keras custom loss function error "No gradients provided"

Problem description

I am trying to train a network with Keras based on TensorFlow 2.3.0. The task is to create new pictures. In a first simple prototype / proof of concept, I am trying to train the network to create pictures with just a given number of non-black pixels. For that I need to define a custom loss function. Doing so, I get ValueError: No gradients provided for any variable, which I have not yet been able to solve.

On top of that, I would prefer a way to code this loss function without having to run eagerly (see my previous question).

import numpy as np
import tensorflow as tf

def custom_loss(y_true, y_pred):
    # Converting the prediction to NumPy leaves the TensorFlow graph,
    # which is what ultimately breaks the gradient computation.
    ndarray = y_pred.numpy()
    mse = np.zeros(ndarray.shape[0])
    for i in range(ndarray.shape[0]):
        true_area = int((y_true[i][0] * 100000).numpy())
        pic = ndarray[i, :, :, :]
        img_np = (pic * 255).astype(np.uint8)
        img = tf.keras.preprocessing.image.array_to_img(img_np)
        count_area = count_nonblack_pil(img)  # helper that counts non-black pixels
        mse[i] = ((count_area - true_area) / 100000)**2
        #img.save(f"custom_loss {i:03d} True {true_area:06d} Count {count_area:06d} MSE {mse[i]:0.4f}.jpg")
    return mse

if __name__ == '__main__':
    tf.config.run_functions_eagerly(True)
    ...
    model.compile(loss=custom_loss, optimizer="adam", run_eagerly=True)
    model.fit(x=train_data, y=train_data, batch_size=16, epochs=10)

Running this code gives me the error message:

ValueError: No gradients provided for any variable: ['dense/kernel:0', 'dense/bias:0', 'conv2d/kernel:0', 'conv2d/bias:0', 'conv2d_1/kernel:0', 'conv2d_1/bias:0', 'conv2d_2/kernel:0', 'conv2d_2/bias:0', 'conv2d_3/kernel:0', 'conv2d_3/bias:0'].

What I have tried so far

The error sounds like the loss function is not differentiable, but why shouldn't it be?

Googling for a solution, I found the suggestion that I might have missed passing the labels (same here), but I already checked this by saving some pictures with their labels (see the line commented out in the code above). This works just fine!

Other than that, I was not able to find any useful hints; all in all, there are not too many Google hits anyway ... (is what I am trying to do that exotic?). Any thoughts?

Thank you for your quick feedback, and sorry for not describing the task of the loss function very clearly; let me give it another try:

I have a model that creates a full 533x800 RGB picture based on a single float input, which is passed to the loss function as y_true. The picture created by the model is also passed to the loss function as y_pred. The loss function then calls a small helper count_nonblack_pil to count the number of non-black pixels in y_pred. The loss is calculated as the squared difference between y_true and that pixel count. By minimizing this difference, I expect to train the model so that it is able to create a picture with a number of non-black pixels close to the input value. Not really useful, but a simple proof of concept for what I plan to do later with a different loss function (where I want to use other, already trained models to calculate the loss for more useful and sophisticated tasks).
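
The helper count_nonblack_pil is not shown in the question. A minimal sketch of what it could look like (the implementation below is only an assumption, counting pixels whose RGB values are not all zero):

import numpy as np

def count_nonblack_pil(img):
    # Hypothetical implementation: a pixel counts as non-black if any channel is non-zero.
    arr = np.asarray(img)
    return int(np.count_nonzero(arr.reshape(-1, arr.shape[-1]).any(axis=-1)))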

Hope that makes sense. To make it more clear:

y_true size : 16
y_pred size : 20467200

y_pred contains 16 pictures of 533x800 pixels with 3 color channels, i.e. 20467200 values. y_true contains just the 16 target pixel counts.
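
As a quick sanity check of those sizes:

# 16 pictures x 533 x 800 pixels x 3 color channels
assert 16 * 533 * 800 * 3 == 20467200   # total number of values in y_pred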

I have now understood the problem, nicely summarized by JimBiardCics: "Keep in mind that the python function you write (custom_loss) is called to generate and compile a C function. The compiled function is what is called during training. When your python custom_loss function is called, the arguments are tensor objects that don't have data attached to them. The K.eval call will fail, as will the K.shape call."
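
The same effect can be reproduced outside of Keras. The following is a minimal sketch (not from the original post) showing how converting a tensor to NumPy inside the computation detaches it from automatic differentiation:

import tensorflow as tf

x = tf.Variable([1.0, 2.0])
with tf.GradientTape() as tape:
    y = (x * 3.0).numpy()                      # leaves the graph: plain NumPy array
    loss = tf.reduce_sum(tf.constant(y) ** 2)  # loss no longer depends on x for autodiff
print(tape.gradient(loss, x))                  # None -> "No gradients provided" in Keras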

Recommended answer

The error is thrown because these operations are not part of the graph and can therefore not be differentiated. What you're trying to do doesn't require eager execution, just a sequence of TensorFlow ops that do what you want.

Since the exact specifics of your algorithm are a bit fuzzy, I'll propose a somewhat simpler solution. It seems that your overall goal is to generate similar images but to reduce the number of black pixels compared to the original. You can do this by retaining the original loss and adding a penalty. I'll assume you want an MSE loss, but it doesn't matter, as you can use any other:

Loss = alpha * MSE + beta * nr_of_black_pixels_in_pred 

with alpha, beta adjusting the influence for each. This can be achieved in a loss like this:

import tensorflow as tf

def custom_loss(y_true, y_pred):
    alpha, beta = 0.8, 0.2  # percentage influence of each term
    mse = tf.keras.losses.mean_squared_error(y_true, y_pred)
    # Mark every exactly-black prediction value with a 1, everything else with a 0 ...
    count = tf.where(y_pred == 0., tf.ones_like(y_pred), tf.zeros_like(y_pred))
    # ... and sum them up to get the number of black values as the penalty.
    bp = tf.math.reduce_sum(count)
    return alpha * mse + beta * bp
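
With the loss built purely from TensorFlow ops like this, the eager-execution switches from the question should no longer be needed. A minimal sketch of the compile/fit calls, assuming the same model and train_data as in the question:

model.compile(loss=custom_loss, optimizer="adam")  # no run_eagerly=True required
model.fit(x=train_data, y=train_data, batch_size=16, epochs=10)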

The benefit is that you can now even say, e.g., y_pred < 50. if you want to include "blackish" pixel values.
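
That variant only changes the condition inside tf.where. A sketch of it (the threshold of 50 assumes pixel values on a 0-255 scale and would need to be adapted to the model's actual output range):

# Penalize "blackish" values instead of only exactly-black ones.
count = tf.where(y_pred < 50., tf.ones_like(y_pred), tf.zeros_like(y_pred))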

Why does this work, and why is it a better solution? If we only penalized black pixels, the network could potentially just generate white images to get the best possible loss (or set all pixels with value 0 to 1). Neither of these "cheating" solutions is likely desirable, so we need to keep the original loss to retain the original behavior, and add the penalty to modify it.

The additional penalty now automatically reduces how often the boolean condition in tf.where is true. Since this count and the loss are on completely different scales, you might have to additionally normalize the penalty. alpha and beta will also have to be tuned patiently and empirically; these kinds of parameters only work properly within a fairly small range of values, which you need to find. For this, I would recommend adding a custom metric that reports what percentage of the total loss is caused by the penalty. Due to the scale differences, it might be necessary to set beta to a very small number (but this is highly application-specific).
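
A minimal sketch of such a metric (the name penalty_fraction is made up here; it recomputes the same terms as custom_loss and reports the penalty's share of the total loss):

def penalty_fraction(y_true, y_pred):
    alpha, beta = 0.8, 0.2  # keep in sync with custom_loss
    mse = tf.reduce_mean(tf.keras.losses.mean_squared_error(y_true, y_pred))
    count = tf.where(y_pred == 0., tf.ones_like(y_pred), tf.zeros_like(y_pred))
    bp = tf.math.reduce_sum(count)
    return (beta * bp) / (alpha * mse + beta * bp)

model.compile(loss=custom_loss, optimizer="adam", metrics=[penalty_fraction])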

