Understanding Tensorflow control dependencies
Question
I am trying to gain a stronger grasp of TensorFlow. I came across the concept of control dependencies. I understand that the order of ops as specified by us is not really relevant to TensorFlow during execution. In order to optimise the speed of execution, TensorFlow decides its own order of calculating nodes. But we can customise the order of execution by using tf.control_dependencies. I am not able to understand the use cases of the function. Can anyone direct me to some resource (other than the documentation) or explain the working of this function? An example:
tf.reset_default_graph()
x = tf.Variable(5)
y = tf.Variable(3)
assign = tf.assign(x, x + y)
z = x + assign
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    with tf.control_dependencies([assign]):
        z_out = sess.run(z)
    print(z_out)
The output of the code is 8. So I infer that since z = x + y, the assign node has not been evaluated (right?). But doesn't this mean that the result of TensorFlow may be erroneous? This means we need to create new nodes during every operation to force TensorFlow to calculate all the nodes leading up to the result. But in, say, training a neural network with 10000 steps, if each step creates a new set of 1000 weights/parameters, won't the space complexity explode?
Answer
In the snippet you have posted, tf.control_dependencies is not having any effect. The function creates a context where new operations are created with a control dependency on the given operations, but in your code there are no new operations within the context, just the evaluation of previously existing operations.
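One way to see the difference is to create the dependent op *inside* the context. The sketch below is a minimal rework of the posted snippet, using tf.identity to force a fresh read of the variable; the tf.compat.v1 import is an assumption so it also runs under TF 2.x (under TF 1.x a plain `import tensorflow as tf` behaves the same):

```python
import tensorflow.compat.v1 as tf  # compat shim, assumed, so this also runs on TF 2.x
tf.disable_eager_execution()

x = tf.Variable(5)
y = tf.Variable(3)
assign = tf.assign(x, x + y)
with tf.control_dependencies([assign]):
    # tf.identity is a *new* op created inside the context, so it picks up
    # a control dependency on `assign`: the variable is read only after
    # the assignment has run.
    z = tf.identity(x) + y

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(z))  # 11: x is first updated to 8, then 8 + 3
```

Here evaluating z forces the assignment to run first, which is what the original snippet was presumably trying to achieve.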
In most cases, control flow in TensorFlow is "obvious", in the sense that there is only one way to make a computation correctly. However, when stateful objects (i.e. variables) are involved, there are situations that may be ambiguous. Consider the following example:
import tensorflow as tf
v1 = tf.Variable(0)
v2 = tf.Variable(0)
upd1 = tf.assign(v1, v2 + 1)
upd2 = tf.assign(v2, v1 + 1)
init = tf.global_variables_initializer()
v1 and v2 are both variables initialized to 0 and then updated. However, each update uses the value of the other variable. In a regular Python program things would run sequentially, so upd1 would run first (so v1 would be 1) and upd2 after (so v2 would be 2, because v1 was 1). But TensorFlow does not record the order in which operations are created, only their dependencies. So it may also happen that upd2 runs before upd1 (so v1 would be 2 and v2 would be 1), or that both update values (v2 + 1 and v1 + 1) are computed before the assignments (so both v1 and v2 would be 1 in the end). Indeed, if I run it several times:
for i in range(10):
    with tf.Session() as sess:
        sess.run(init)
        sess.run([upd1, upd2])
        print(*sess.run([v1, v2]))
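The three possible outcomes can be mimicked in plain Python (a hypothetical analogy, not TensorFlow API), which makes the ambiguity explicit:

```python
# Each function models one execution order TensorFlow may choose.
def run_sequential_upd1_first():
    v1, v2 = 0, 0
    v1 = v2 + 1   # upd1 first -> v1 = 1
    v2 = v1 + 1   # upd2 sees the updated v1 -> v2 = 2
    return v1, v2

def run_sequential_upd2_first():
    v1, v2 = 0, 0
    v2 = v1 + 1   # upd2 first -> v2 = 1
    v1 = v2 + 1   # upd1 sees the updated v2 -> v1 = 2
    return v1, v2

def run_simultaneous():
    v1, v2 = 0, 0
    new_v1, new_v2 = v2 + 1, v1 + 1  # both computed from the old values
    v1, v2 = new_v1, new_v2          # then both assigned
    return v1, v2

print(run_sequential_upd1_first())  # (1, 2)
print(run_sequential_upd2_first())  # (2, 1)
print(run_simultaneous())           # (1, 1)
```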
I do not always get the same result (personally I get 1 1 and 2 1, although technically 1 2 would also be possible). If, for example, you wanted to compute the new value for v2 after v1 has been updated, you could just do the following:
import tensorflow as tf
v1 = tf.Variable(0)
v2 = tf.Variable(0)
upd1 = tf.assign(v1, v2 + 1)
upd2 = tf.assign(v2, upd1 + 1)
init = tf.global_variables_initializer()
Here the new value of v2 is computed using upd1, which is guaranteed to be the value of the variable after the update. So here upd2 would have an implicit dependency on the assignment, and so things would work as expected.
But what if you wanted to always compute the new values for v1 and v2 using the non-updated variable values (that is, consistently end up with both v1 and v2 being 1)? In that case you can use tf.control_dependencies:
import tensorflow as tf
v1 = tf.Variable(0)
v2 = tf.Variable(0)
new_v1 = v2 + 1
new_v2 = v1 + 1
with tf.control_dependencies([new_v1, new_v2]):
    upd1 = tf.assign(v1, new_v1)
    upd2 = tf.assign(v2, new_v2)
init = tf.global_variables_initializer()
Here, the assignment operations cannot happen until the new values for v1 and v2 have been computed, so their final values will always be 1 in both cases.
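Rebuilt as a self-contained sketch (using the tf.compat.v1 shim, an assumption so it also runs under TF 2.x), repeated runs of this version are deterministic:

```python
import tensorflow.compat.v1 as tf  # compat shim, assumed, so this also runs on TF 2.x
tf.disable_eager_execution()

v1 = tf.Variable(0)
v2 = tf.Variable(0)
new_v1 = v2 + 1
new_v2 = v1 + 1
# Both assignments must wait until *both* update values have been
# computed from the old variable contents.
with tf.control_dependencies([new_v1, new_v2]):
    upd1 = tf.assign(v1, new_v1)
    upd2 = tf.assign(v2, new_v2)
init = tf.global_variables_initializer()

for _ in range(10):
    with tf.Session() as sess:
        sess.run(init)
        sess.run([upd1, upd2])
        print(*sess.run([v1, v2]))  # always "1 1"
```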