How do I reduce memory consumption in a loop in TensorFlow?
Question
I have a loop in TensorFlow that looks like this:
with tf.device("/gpu:1"):
    losses = []
    for target, output in zip(targets, lstm_outputs):
        logits = tf.matmul(W, output) + b
        loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, target)
        losses.append(loss)
    total_loss = tf.add_n(losses)
I am getting an OOM error when allocating the gradients for this layer, since each matrix multiplication is a separate operation in the graph that takes memory. Is there a way to prevent TensorFlow from allocating all of these operations at the same time?
Answer
This is a challenging graph for TensorFlow to optimize, since the activations from each layer must be kept in order to aggregate a single gradient for W. One possibility is to pass the experimental aggregation_method argument when calling optimizer.minimize().
For example, you could try the following:
optimizer = tf.train.AdagradOptimizer(...)  # Or another optimization algorithm.
train_op = optimizer.minimize(
    total_loss,
    aggregation_method=tf.AggregationMethod.EXPERIMENTAL_ACCUMULATE_N)
This option eagerly aggregates the gradients for recurrently used variables in place, rather than keeping them all in memory until all of the gradients have been computed. If this doesn't work, tf.AggregationMethod.EXPERIMENTAL_TREE may work better.
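The memory difference between the two strategies can be sketched in plain Python (purely illustrative, no TensorFlow; the function names here are hypothetical, not part of any API):

```python
# Sketch: why folding gradients into a running total as they are produced
# has lower peak memory than buffering them all and summing at the end.

def sum_keep_all(grads):
    # Analogue of the default behavior: every per-step gradient is
    # materialized at once, then summed (peak memory ~ len(grads) tensors).
    buffered = list(grads)  # all gradients alive simultaneously
    return sum(buffered)

def sum_accumulate(grads):
    # Analogue of EXPERIMENTAL_ACCUMULATE_N: each gradient is folded into
    # a running total as soon as it is produced (peak memory ~ 1 tensor).
    total = 0
    for g in grads:
        total += g  # g can be freed after this step
    return total

per_step_grads = [0.5, 1.25, -0.75, 2.0]
assert sum_keep_all(per_step_grads) == sum_accumulate(per_step_grads)
```

Both strategies produce the same total gradient; only the number of intermediate tensors held alive at once differs, which is what matters for the OOM error above.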