How to check NaN in gradients in Tensorflow when updating?
Question
All,
When you train a large model on a large number of samples, some samples may produce NaN gradients during the parameter update.
I want to find these samples. Meanwhile, I don't want the gradients from such a batch to update the model's parameters, because that may turn the parameters themselves into NaN.
Does anyone have a good idea for dealing with this problem?
My code is as follows:
# Create an optimizer.
params = tf.trainable_variables()
opt = tf.train.AdamOptimizer(1e-3)
gradients = tf.gradients(self.loss, params)
max_gradient_norm = 10
clipped_gradients, self.gradient_norms = tf.clip_by_global_norm(
    gradients, max_gradient_norm)
self.optimizer = opt.apply_gradients(zip(clipped_gradients, params))
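The clipping step above rescales all gradients together so that their combined (global) L2 norm does not exceed `max_gradient_norm`. As an illustration of the math only (a pure-Python sketch, not the TensorFlow implementation), it works like this:

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient vectors so their combined (global) L2 norm
    is at most max_norm; mirrors what tf.clip_by_global_norm computes."""
    global_norm = math.sqrt(sum(v * v for g in grads for v in g))
    if global_norm <= max_norm:
        return grads, global_norm
    scale = max_norm / global_norm
    return [[v * scale for v in g] for g in grads], global_norm

# Toy gradients with global norm sqrt(9 + 16 + 0 + 144) = 13.
grads = [[3.0, 4.0], [0.0, 12.0]]
clipped, norm = clip_by_global_norm(grads, 10.0)
# All gradients are scaled by the same factor 10/13, so their
# relative directions are preserved while the global norm becomes 10.
```

Note that clipping by the global norm cannot fix NaN gradients: NaN times any scale factor is still NaN, which is why the extra check below is needed.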
Answer
You can check whether your gradients contain NaN with tf.check_numerics:
# tf.check_numerics takes a single tensor plus a message, so check each
# clipped gradient and group the checks into one op.
grad_check = tf.group(
    *[tf.check_numerics(g, "NaN/Inf in gradient") for g in clipped_gradients])
with tf.control_dependencies([grad_check]):
    self.optimizer = opt.apply_gradients(zip(clipped_gradients, params))
grad_check will raise an InvalidArgument error if any of the clipped gradients is NaN or infinite.
The tf.control_dependencies context makes sure that grad_check is evaluated before the gradients are applied.
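This raises an error on a bad batch rather than skipping it, which is useful for finding the offending samples. The skip-the-batch behavior the question asks for can be sketched framework-free in plain Python (a toy SGD step with made-up values, not TensorFlow API):

```python
import math

def gradients_are_finite(grads):
    """True only if every gradient value is finite (no NaN, no Inf)."""
    return all(math.isfinite(v) for g in grads for v in g)

def apply_update(params, grads, lr=0.1):
    """Apply a plain SGD step, but skip the whole batch when any gradient
    is NaN/Inf, so bad samples cannot poison the parameters.
    Returns the (possibly unchanged) params and whether the step applied."""
    if not gradients_are_finite(grads):
        return params, False  # flag this batch for inspection
    updated = [[p - lr * v for p, v in zip(param, grad)]
               for param, grad in zip(params, grads)]
    return updated, True

params = [[1.0, 2.0]]
good = [[0.5, -0.5]]
bad = [[float("nan"), 0.0]]

params, ok = apply_update(params, good)    # step is applied
params, ok2 = apply_update(params, bad)    # step is skipped, params unchanged
```

In graph-mode TensorFlow the same idea would be expressed with a conditional on the finiteness of the gradients rather than Python control flow, but the logic is the one above.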
See also tf.add_check_numerics_ops(), which attaches a check_numerics op to every floating-point tensor in the graph.