Tensorflow dense gradient explanation?


Question

I recently implemented a model, and when I ran it I received this warning:

UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape.
This may consume a large amount of memory.

With some similar parameter settings (embedding dimensionalities), the model suddenly becomes ridiculously slow.

  1. What does this warning mean? It seems that something I am doing has caused all of the gradients to become dense, so backpropagation is doing dense matrix computations.
  2. If the problem is in my model, how can I identify and fix it?

Answer

This warning is printed when a sparse tf.IndexedSlices object is implicitly converted to a dense tf.Tensor. This typically happens when one op (usually tf.gather()) backpropagates a sparse gradient, but the op that receives it does not have a specialized gradient function that can handle sparse gradients. As a result, TensorFlow automatically densifies the tf.IndexedSlices, which can have a devastating effect on performance if the tensor is large.
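A minimal sketch of how the conversion can be triggered. Note this uses the TF 2.x eager API (tf.GradientTape), which postdates the original answer; the table shape and ids below are made up for illustration. Because tf.gather() here reads the result of another op rather than the variable itself, the sparse gradient must flow back through tf.multiply, whose gradient function is not sparse-aware, so it arrives at the variable as a dense tensor:

```python
import tensorflow as tf

# Hypothetical embedding table and lookup ids, for illustration only.
params = tf.Variable(tf.random.normal([1000, 64]))
ids = tf.constant([3, 17, 42])

with tf.GradientTape() as tape:
    # Gathering from the *result of an op* (params * 2.0), not the variable
    # itself: the IndexedSlices gradient of tf.gather() must be backpropagated
    # through tf.multiply, and TensorFlow densifies it along the way.
    scaled = params * 2.0
    loss = tf.reduce_sum(tf.gather(scaled, ids))

grad = tape.gradient(loss, params)
# The gradient w.r.t. params comes back dense, even though only 3 of the
# 1000 rows were actually touched.
print(isinstance(grad, tf.IndexedSlices))  # False
print(grad.shape)                          # (1000, 64)
```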

To fix this problem, you should try to ensure that the params input to tf.gather() (or the params input to tf.nn.embedding_lookup()) is a tf.Variable. Variables can receive the sparse updates directly, so no conversion is needed. Although tf.gather() (and tf.nn.embedding_lookup()) accept arbitrary tensors as inputs, this may lead to a more complicated backpropagation graph, resulting in implicit conversion.
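The fix above can be sketched as follows (again using the TF 2.x tf.GradientTape API as an assumption, with a made-up table shape). When tf.gather() reads the tf.Variable directly, the gradient stays sparse all the way back:

```python
import tensorflow as tf

# Same hypothetical embedding table, but this time tf.gather() reads the
# tf.Variable directly, so the sparse gradient reaches it unconverted.
params = tf.Variable(tf.random.normal([1000, 64]))
ids = tf.constant([3, 17, 42])

with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.gather(params, ids))

grad = tape.gradient(loss, params)
# The gradient remains a tf.IndexedSlices: only the gathered rows are
# represented, so an optimizer can apply a sparse update.
print(isinstance(grad, tf.IndexedSlices))  # True
print(grad.indices)                        # the gathered row indices
```

An optimizer such as tf.keras.optimizers.SGD can consume the IndexedSlices directly and update only those rows, which is what makes the variable form fast for large embedding tables.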

