Non-deterministic Gradient Computation
Question
I realized that my models end up being different every time I train them, even though I keep the TensorFlow random seed the same.
I verified that:
- Initialization is deterministic; the weights are identical before the first update.
- Inputs are deterministic. In fact, various forward computations, including the loss, are identical for the very first batch.
- The gradients for the first batch are different. Concretely, I'm comparing the outputs of tf.gradients(loss, train_variables). While loss and train_variables have identical values, the gradients are sometimes different for some of the Variables. The differences are quite significant (sometimes the sum-of-absolute-differences for a single variable's gradient is greater than 1).
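The sum-of-absolute-differences comparison described above can be sketched in plain NumPy, assuming the gradient arrays from two runs have already been fetched (the run_a/run_b values here are illustrative stand-ins, not real TensorFlow output):

```python
import numpy as np

def grad_diffs(grads_a, grads_b):
    """Per-variable sum of absolute differences between two runs' gradients."""
    return [float(np.abs(a - b).sum()) for a, b in zip(grads_a, grads_b)]

# Illustrative stand-ins for gradients fetched from two supposedly identical runs.
rng = np.random.RandomState(0)
base = rng.randn(3, 4).astype(np.float32)
run_a = [base, base.copy()]
run_b = [base, base + np.float32(0.25)]  # pretend the second variable diverged

diffs = grad_diffs(run_a, run_b)
# Flag variables whose gradients differ beyond floating-point noise.
suspects = [i for i, d in enumerate(diffs) if d > 1.0]
```

Running a check like this per variable narrows the search to the ops whose gradients actually diverge.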
I conclude that it's the gradient computation that causes the non-determinism.
I had a look at this question, and the problem persists when running on a CPU with intra_op_parallelism_threads=1 and inter_op_parallelism_threads=1.
How can the backward pass be non-deterministic when the forward pass isn't? How could I debug this further?
Answer
This answer might seem a little obvious, but do you use some kind of non-deterministic regularization such as dropout? Given that dropout "drops" some connections randomly during training, it may be causing the differences in the gradients.
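As an illustration of the mechanism (a NumPy sketch of inverted dropout, not TensorFlow's actual implementation): the random keep-mask drawn in the forward pass also scales the backward pass, so two runs with different mask draws yield different gradients even from identical weights, inputs, and loss values.

```python
import numpy as np

def dropout_backward(upstream_grad, rate, rng):
    """Backward pass of inverted dropout: the random keep-mask scales the gradient."""
    keep = rng.uniform(size=upstream_grad.shape) >= rate
    return upstream_grad * keep / (1.0 - rate)

g = np.ones(1000)  # upstream gradient, identical in every run

# Differently seeded (or unseeded) masks -> different gradients per run.
g_run1 = dropout_backward(g, 0.5, np.random.RandomState(1))
g_run2 = dropout_backward(g, 0.5, np.random.RandomState(2))

# Fixing the mask's seed makes the backward pass repeatable.
g_rep1 = dropout_backward(g, 0.5, np.random.RandomState(7))
g_rep2 = dropout_backward(g, 0.5, np.random.RandomState(7))
```

If dropout is in the graph, passing an explicit seed to it (or removing it temporarily) should make the gradient comparison come out identical.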
Similar questions:
- How to get stable results with TensorFlow, setting random seed
- Tensorflow not being deterministic, where it should
Edit 2: This seems to be an issue with TensorFlow's implementation. See the following open issues on GitHub:
- Problems Getting TensorFlow to behave Deterministically
- Non-deterministic behaviour when ran on GPU
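Beyond dropout, the issues above largely trace back to reduction order: floating-point addition is not associative, so a parallel backward pass (especially on GPU) that accumulates partial gradients in a run-dependent order can return different results from bit-identical inputs. A tiny float32 illustration of the effect:

```python
import numpy as np

# Three terms whose float32 sum depends on the order of accumulation.
terms = np.array([1e8, 1.0, -1e8], dtype=np.float32)

def reduce_in_order(order):
    """Accumulate the same terms in a given order, as a parallel sum might."""
    total = np.float32(0.0)
    for i in order:
        total += terms[i]
    return float(total)

s1 = reduce_in_order([0, 1, 2])  # (1e8 + 1) - 1e8: the 1 is lost to rounding
s2 = reduce_in_order([0, 2, 1])  # (1e8 - 1e8) + 1: the 1 survives
```

Here s1 is 0.0 and s2 is 1.0 from the very same three numbers, which is how a non-deterministic reduction schedule turns a deterministic forward pass into a non-deterministic backward pass.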