Why does tf.assign() slow the execution time?
Question
Today I added learning rate decay to my LSTM in TensorFlow.
I changed
train_op = tf.train.RMSPropOptimizer(lr_rate).minimize(loss)
to
lr = tf.Variable(0.0,trainable=False)
and run this on every training step:
sess.run(tf.assign(lr, lr_rate*0.9**epoch))
However, this change increased the execution time from ~7 minutes to over 20 minutes.
My question is: why does this change increase the execution time?
An obvious workaround is to do the assignment only every 1000 iterations. However, I'd like to understand the reasoning behind this.
- Does sess.run() take extra time?
- Does tf.assign() take extra time?
- Can I implement this tf.assign() in another, more efficient way?
Answer
The problem you have has nothing to do with sess.run or tf.assign. This is a very common issue in many models: your model is slow because of a bloated graph. I will explain what all of this means with a very simple example that has nothing to do with your code. Take a look at these two snippets:
Snippet 1
a = tf.Variable(1, name='a')
b = tf.Variable(2, name='b')
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        print sess.run(tf.add(a, b)),
Snippet 2
a = tf.Variable(1, name='a')
b = tf.Variable(2, name='b')
res = tf.add(a, b)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(3):
        print sess.run(res),
Both of them return the same values and it looks like they both do the same thing. The problem is that they create different graphs: if you print len(tf.get_default_graph().get_operations()) after the loop, you will see that Snippet 1 has more nodes than Snippet 2. Increase the range to a few thousand and the difference will be significant.
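The growth pattern behind this node count can be mimicked without TensorFlow at all. The following toy "graph" (an analogy of mine, not part of the original answer) just records every op created, showing why building ops inside the loop makes the graph grow while building once keeps it constant:

```python
class ToyGraph:
    """Minimal stand-in for a TF graph: it only records the ops created in it."""
    def __init__(self):
        self.ops = []

    def add(self, a, b):
        self.ops.append(('add', a, b))  # every call appends a brand-new node
        return a + b

# Snippet-1 style: the op is built inside the loop, so the graph grows each iteration
g1 = ToyGraph()
for _ in range(3):
    g1.add(1, 2)

# Snippet-2 style: the op is built once and reused; the graph size stays constant
g2 = ToyGraph()
g2.add(1, 2)  # a real session would simply re-run this single node

print(len(g1.ops), len(g2.ops))  # prints: 3 1
```

The same asymmetry is what len(tf.get_default_graph().get_operations()) exposes in the real snippets.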
You have the same problem with a bloated graph: in each iteration of the loop, tf.assign(lr, lr_rate*0.9**epoch) creates 3 new nodes in the graph. Move your graph definition out of the run loop (separate graph construction from graph execution) and you will see the improvement.
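Following that advice, one way to restructure the questioner's code is to build a single assign op fed through a placeholder, so the training loop only runs existing nodes instead of creating new ones. This is a sketch assuming TensorFlow 1.x (the API used throughout this post); the names decayed_lr, new_lr, and update_lr are my own, and the TF part is guarded so the sketch degrades gracefully where TF 1.x is unavailable:

```python
def decayed_lr(base_lr, decay, epoch):
    # Pure-Python schedule, same formula as the question: lr_rate * 0.9**epoch
    return base_lr * decay ** epoch

try:
    import tensorflow as tf  # TensorFlow 1.x assumed

    lr = tf.Variable(0.0, trainable=False)
    new_lr = tf.placeholder(tf.float32, shape=[])
    update_lr = tf.assign(lr, new_lr)  # assign op built ONCE, outside the loop

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch in range(3):
            # feed a plain float; no new graph nodes are created per iteration
            sess.run(update_lr, feed_dict={new_lr: decayed_lr(0.01, 0.9, epoch)})
except (ImportError, AttributeError):
    pass  # TF 1.x not installed; the graph-building pattern above is the point
```

With this structure the graph size is fixed after construction, so the per-step cost of updating the learning rate stays constant no matter how many epochs you run.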