Keras NaN value when computing the loss
Problem description
My question is related to this one.
I am working to implement the method described in the article https://drive.google.com/file/d/1s-qs-ivo_fJD9BU_tM5RY8Hv-opK4Z-H/view . The final algorithm to use is there (on page 6):
- d is a unit vector
- xi is a non-zero number
- D is a loss function (sparse cross-entropy in my case)
The idea is to do adversarial training by modifying the data in the direction where the network is the most sensitive to small changes, and training the network with the modified data but with the same labels as the original data.
The loss function used to train the model is here:
- l is the loss measure on the labelled data
- Rvadv is the value inside the gradient in the picture of Algorithm 1
- the article chose alpha = 1
The idea is to incorporate the performance of the model on the labelled dataset into the loss.
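That combined objective can be sketched as follows. Again a hedged sketch, not the article's code: `model`, `x`, `y` and `r_adv` (the perturbation from Algorithm 1) are assumed names, and KL divergence stands in for the divergence term.

```python
import tensorflow as tf

def total_loss(model, x, y, r_adv, alpha=1.0):
    # Supervised term l: sparse cross-entropy on the labelled batch.
    ce = tf.keras.losses.SparseCategoricalCrossentropy()
    supervised = ce(y, model(x, training=True))
    # Rvadv term: divergence between predictions on clean and perturbed data;
    # stop_gradient keeps the clean predictions out of backpropagation.
    p = tf.stop_gradient(model(x, training=True))
    q = model(x + r_adv, training=True)
    r_vadv = tf.reduce_mean(tf.keras.losses.kld(p, q))
    # alpha = 1 in the article.
    return supervised + alpha * r_vadv
```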
I am trying to implement this method in Keras with the MNIST dataset and mini-batches of 100 samples. When I try to do the final gradient descent to update the weights, NaN values appear after a few iterations, and I don't know why. I posted the notebook in a Colab session (I don't know for how long it will stay up, so I also post the code in a gist):
- Colab session: https://colab.research.google.com/drive/1lowajNWD-xvrJDEcVklKOidVuyksFYU3?usp=sharing
- gist: https://gist.github.com/DridriLaBastos/e82ec90bd699641124170d07e5a8ae4c
Recommended answer
This is a fairly standard NaN-in-training problem; I suggest you read this answer about NaN issues with the Adam solver for the common causes and solutions.
Basically I just made the following two changes, and the code ran without NaN in the gradients:
- Lower the learning rate of the optimizer passed to model.compile, to optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3)
- Replace C = [loss(label, pred) for label, pred in zip(yBatchTrain, dumbModel(dataNoised, training=False))] with C = loss(yBatchTrain, dumbModel(dataNoised, training=False))
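Put together, the two changes look roughly like this. `dumbModel`, `yBatchTrain` and `dataNoised` are the names from the question's gist, recreated here with a stand-in model and MNIST-shaped random data so the snippet is self-contained:

```python
import tensorflow as tf

# Stand-in for the gist's model; the real one is a trained MNIST classifier.
loss = tf.keras.losses.SparseCategoricalCrossentropy()
dumbModel = tf.keras.Sequential([tf.keras.layers.Dense(10, activation="softmax")])
dumbModel.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),  # change 1: lower LR
    loss=loss,
)

# Mini-batch of 100 MNIST-shaped samples (flattened 28x28 images).
dataNoised = tf.random.normal([100, 784])
yBatchTrain = tf.random.uniform([100], maxval=10, dtype=tf.int32)

# Change 2: one vectorised loss call over the whole batch, instead of a
# Python list comprehension computing the loss sample by sample.
C = loss(yBatchTrain, dumbModel(dataNoised, training=False))
```

The vectorised call returns a single scalar loss, which is what the gradient step expects.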
If you still get this kind of error, the next few things you could try are:
- Clip the loss or the gradients
- Switch all tensors from tf.float32 to tf.float64
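A minimal sketch of both suggestions; the tensors below are toy values for illustration only:

```python
import tensorflow as tf

# Gradient clipping can be requested directly on the Keras optimizer...
opt = tf.keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)

# ...or applied by hand before apply_gradients:
grads = [tf.constant([3.0, 4.0])]                 # global norm is 5
clipped, global_norm = tf.clip_by_global_norm(grads, 1.0)

# Switching a tensor (or a layer) from tf.float32 to tf.float64:
x64 = tf.cast(tf.constant([1.0, 2.0]), tf.float64)
layer64 = tf.keras.layers.Dense(4, dtype="float64")
```

Float64 roughly doubles memory and compute cost, so try clipping and a lower learning rate first.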
Next time you face this kind of error, you can use tf.debugging.check_numerics to find the root cause of the NaN.
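A small illustration of how check_numerics reports a NaN; the tensors here are toy values:

```python
import tensorflow as tf

good = tf.constant([1.0, 2.0])
tf.debugging.check_numerics(good, message="loss")   # passes silently

bad = tf.constant([1.0, float("nan")])
try:
    tf.debugging.check_numerics(bad, message="loss")
    caught = False
except tf.errors.InvalidArgumentError:
    # The raised error includes the message, which helps locate the tensor.
    caught = True
```

Inserting such checks after each step of the gradient computation narrows down exactly where the NaN first appears.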