Keras NaN value when computing the loss


Problem description

My question is related to this one.


I am working to implement the method described in the article https://drive.google.com/file/d/1s-qs-ivo_fJD9BU_tM5RY8Hv-opK4Z-H/view. The final algorithm to use is here (it is on page 6 of the article):

  • d is a unit vector
  • xhi is a non-null number
  • D is a loss function (sparse cross-entropy in my case)
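
For concreteness, here is a minimal sketch of that power-iteration step in TensorFlow. It is my reading of the algorithm, not code from the article or the notebook: model, x, y and the values of xhi and epsilon are assumptions.

    import tensorflow as tf

    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()  # D in the list above
    xhi = 1e-6      # the small scalar xhi (assumed value)
    epsilon = 2.0   # magnitude of the final perturbation (assumed value)

    def normalize(d):
        # Rescale each sample of the batch to unit L2 norm (d must be a unit vector).
        flat = tf.reshape(d, [tf.shape(d)[0], -1])
        flat = flat / (tf.norm(flat, axis=1, keepdims=True) + 1e-12)
        return tf.reshape(flat, tf.shape(d))

    def adversarial_perturbation(model, x, y, iterations=1):
        # Start from a random unit direction d.
        d = normalize(tf.random.normal(tf.shape(x)))
        for _ in range(iterations):
            with tf.GradientTape() as tape:
                tape.watch(d)
                # D(y, model(x + xhi * d)): how much a tiny step along d hurts.
                dist = loss_fn(y, model(x + xhi * d, training=False))
            # The gradient w.r.t. d points in the most sensitive direction;
            # stop_gradient treats it as a constant w.r.t. the model weights.
            d = normalize(tf.stop_gradient(tape.gradient(dist, d)))
        return epsilon * d  # the perturbation added to the data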

The idea is to do adversarial training by modifying the data in the direction where the network is most sensitive to small changes, and training the network with the modified data but with the same labels as the original data.

The loss function used to train the model is here:

  • l is the loss measure on the labelled data
  • Rvadv is the value inside the gradient in the picture of algorithm 1
  • the article chose alpha = 1

The idea is to incorporate the model's performance on the labelled dataset into the loss.
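
And a hedged sketch of the combined loss, reusing loss_fn and adversarial_perturbation from the sketch above; alpha = 1 is the article's choice per the list above, everything else is an assumption:

    alpha = 1.0  # the article's choice

    def total_loss(model, x, y):
        # l: ordinary supervised loss on the labelled batch.
        l = loss_fn(y, model(x, training=True))
        # Rvadv term: loss on the perturbed batch, keeping the original labels.
        r = adversarial_perturbation(model, x, y)
        rvadv = loss_fn(y, model(x + r, training=True))
        return l + alpha * rvadv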

I am trying to implement this method in Keras with the MNIST dataset and mini-batches of 100 samples. When I try the final gradient descent to update the weights, after some iterations NaN values appear, and I don't know why. I posted the notebook in a Colab session (I don't know for how long it will stay up, so I also posted the code in a gist):

Answer

This is a fairly standard problem of NaN in training; I suggest you read this answer about the NaN issue with the Adam solver for the cause and solution in the common case.

Basically, I just made the following two changes and the code ran without NaN in the gradients:

  1. Lower the learning rate of the optimizer in model.compile to optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3)

  2. Replace C = [loss(label,pred) for label, pred in zip(yBatchTrain,dumbModel(dataNoised,training=False))] with C = loss(yBatchTrain,dumbModel(dataNoised,training=False))
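
Put together, the two changes look roughly like this. dumbModel, dataNoised and yBatchTrain are names from the question's gist; the loss and metrics arguments are illustrative:

    import tensorflow as tf

    # Change 1: compile with a lower Adam learning rate.
    dumbModel.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )

    # Change 2: call the loss object on the whole batch so it reduces to a
    # single tensor, instead of building a Python list of per-example losses.
    loss = tf.keras.losses.SparseCategoricalCrossentropy()
    C = loss(yBatchTrain, dumbModel(dataNoised, training=False))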

If you still have this kind of error, then the next few things you could try are:

  1. Clip the loss or the gradients
  2. Switch all tensors from tf.float32 to tf.float64
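
Both can be expressed in a line or two in Keras; this is a sketch and the clipnorm value is illustrative:

    import tensorflow as tf

    # 1. Clip each gradient by its norm directly in the optimizer.
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)

    # 2. Make float64 the default dtype for all Keras layers and tensors.
    tf.keras.backend.set_floatx("float64")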

Next time you face this kind of error, you can use tf.debugging.check_numerics to find the root cause of the NaN.
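
For example, in a custom training step you could check every gradient right after computing it; the helper name, grads, tape and the message text are illustrative:

    import tensorflow as tf

    def assert_finite(tensors, label):
        # tf.debugging.check_numerics raises InvalidArgumentError as soon as a
        # tensor contains NaN or Inf, pointing at the first offending tensor.
        for i, t in enumerate(tensors):
            tf.debugging.check_numerics(t, message=f"{label}[{i}] has NaN/Inf")

    # Usage inside the training step, right after computing the gradients:
    #   grads = tape.gradient(loss_value, model.trainable_variables)
    #   assert_finite(grads, "grad")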

