Tensorflow dense gradient explanation?
Question
I recently implemented a model and when I ran it I received this warning:
UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
With some similar parameter settings (embedding dimensionalities), the model suddenly becomes ridiculously slow.
- What does this warning mean? It seems that something I am doing has caused all of the gradients to become dense, so backpropagation is doing dense matrix computations.
- If it is a problem with my model that is causing this, how can I identify and fix it?
Answer
This warning is printed when a sparse tf.IndexedSlices object is implicitly converted to a dense tf.Tensor. This typically happens when one op (usually tf.gather()) backpropagates a sparse gradient, but the op that receives it does not have a specialized gradient function that can handle sparse gradients. As a result, TensorFlow automatically densifies the tf.IndexedSlices, which can have a devastating effect on performance if the tensor is large.
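The difference can be observed in a short TF 2.x sketch (a hedged illustration; the shapes and indices below are made up). When the gathered tensor is a tf.Variable, the gradient stays sparse; when it is the output of another op, backpropagating through that op densifies it:

```python
import tensorflow as tf

ids = tf.constant([0, 2, 2])

# Case 1: params is a tf.Variable. The gradient of tf.gather()
# arrives as a sparse tf.IndexedSlices and stays sparse.
params = tf.Variable(tf.ones([5, 3]))
with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.gather(params, ids))
grad = tape.gradient(loss, params)

# Case 2: params is computed by another op (here a no-op multiply).
# The sparse gradient must flow back through that op, whose gradient
# function cannot handle IndexedSlices, so it is densified.
var = tf.Variable(tf.ones([5, 3]))
with tf.GradientTape() as tape:
    derived = var * 1.0
    loss = tf.reduce_sum(tf.gather(derived, ids))
grad2 = tape.gradient(loss, var)

print(isinstance(grad, tf.IndexedSlices))   # True: sparse gradient
print(isinstance(grad2, tf.IndexedSlices))  # False: densified to a Tensor
```

For a large embedding table, the dense gradient in the second case has the full [vocab, dim] shape even though only a few rows were touched, which is where the memory and speed cost comes from.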
To fix this problem, you should try to ensure that the params input to tf.gather() (or the params input to tf.nn.embedding_lookup()) is a tf.Variable. Variables can receive sparse updates directly, so no conversion is needed. Although tf.gather() (and tf.nn.embedding_lookup()) accept arbitrary tensors as inputs, this can lead to a more complicated backpropagation graph, resulting in implicit conversion.
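A minimal sketch of the recommended pattern (the vocabulary size, embedding dimension, and ids are invented for illustration): feed the tf.Variable directly to tf.nn.embedding_lookup(), so the gradient stays a sparse IndexedSlices that the optimizer can apply as a per-row update.

```python
import tensorflow as tf

# Hypothetical embedding table sizes, for illustration only.
vocab_size, embed_dim = 1000, 16
embeddings = tf.Variable(tf.random.normal([vocab_size, embed_dim]))
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

ids = tf.constant([[3, 7], [42, 7]])
with tf.GradientTape() as tape:
    # The Variable itself is the params input -- no intermediate op.
    vecs = tf.nn.embedding_lookup(embeddings, ids)
    loss = tf.reduce_mean(tf.square(vecs))
grad = tape.gradient(loss, embeddings)

# grad is a sparse IndexedSlices covering only rows 3, 7 and 42;
# the optimizer applies it as a sparse row update, touching nothing else.
opt.apply_gradients([(grad, embeddings)])
```

If, instead, embeddings were first passed through some transformation (a matmul, a cast, etc.) before the lookup, the gradient would be densified as described above; when such a transformation is needed, it is usually cheaper to apply it to the small gathered slices rather than to the whole table.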