Can cond support TF ops with side effects?


Question


The (source code) documentation for tf.cond is unclear on whether the functions to be performed when the predicate is evaluated can have side effects or not. I've done some tests but I'm getting conflicting results. For example the code below does not work:

import tensorflow as tf
from tensorflow.python.ops import control_flow_ops

pred = tf.placeholder(tf.bool, [])
count = tf.Variable(0)
adder = count.assign_add(1)
subtractor = count.assign_sub(2)

my_op = control_flow_ops.cond(pred, lambda: adder, lambda: subtractor)

sess = tf.InteractiveSession()
tf.initialize_all_variables().run()

my_op.eval(feed_dict={pred: True})
count.eval() # returns -1

my_op.eval(feed_dict={pred: False})
count.eval() # returns -2


I.e. no matter what value the predicate evaluates to, both functions are getting run, and so the net result is a subtraction of 1. On the other hand, this code snippet does work, where the only difference is that I add new ops to the graph every time my_op is called:

pred = tf.placeholder(tf.bool, [])
count = tf.Variable(0)

my_op = control_flow_ops.cond(pred, lambda:count.assign_add(1), lambda:count.assign_sub(2))

sess = tf.InteractiveSession()
tf.initialize_all_variables().run()

my_op.eval(feed_dict={pred: False})
count.eval() # returns -2

my_op.eval(feed_dict={pred: True})
count.eval() # returns -1


Not sure why creating new ops every time works while the other case doesn't, but I'd obviously rather not be adding nodes as the graph will eventually become too big.

Answer


Your second version, where the assign_add() and assign_sub() ops are created inside the lambdas passed to cond(), is the correct way to do this. Fortunately, each of the two lambdas is only evaluated once, during the call to cond(), so your graph will not grow without bound.
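The "each lambda is traced exactly once" point can be sketched in plain Python. This is an illustrative simulation, not TensorFlow's actual implementation; `build_cond` is a hypothetical stand-in for the graph-construction step of `cond()`:

```python
# Sketch: graph construction calls each branch function exactly once,
# no matter how many times the resulting op is later run.
calls = []

def build_cond(if_true, if_false):
    # Construction time: trace each lambda once to build its subgraph.
    true_branch = if_true()
    false_branch = if_false()
    # Return a runnable "op"; running it does NOT re-trace the lambdas.
    def run(pred):
        return true_branch if pred else false_branch
    return run

my_op = build_cond(lambda: calls.append("t") or "add_subgraph",
                   lambda: calls.append("f") or "sub_subgraph")

for _ in range(100):
    my_op(True)

print(calls)  # ['t', 'f']: one trace per lambda, so the graph stays fixed-size
```

This is why the second version is safe to call repeatedly: the subgraphs are built once, up front, and only re-executed afterwards.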


Essentially what cond() does is the following:


  1. Create a Switch node, which forwards its input to only one of two outputs, depending on the value of pred. Let's call the outputs pred_true and pred_false. (They have the same value as pred but that's unimportant since this is never directly evaluated.)


  2. Build the subgraph corresponding to the if_true lambda, where all of the nodes have a control dependency on pred_true.


  3. Build the subgraph corresponding to the if_false lambda, where all of the nodes have a control dependency on pred_false.


  4. Zip together the lists of return values from the two lambdas, and create a Merge node for each of these. A Merge node takes two inputs, of which only one is expected to be produced, and forwards it to its output.


  5. Return the tensors that are the outputs of the Merge nodes.
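The steps above can be sketched as a small pure-Python simulation. This is an illustration of the Switch/Merge dataflow idea, not TensorFlow internals; `switch`, `merge`, and `cond` here are hypothetical stand-ins for the graph nodes described, with `None` standing for "this output was not produced":

```python
def switch(pred, value):
    # Forward `value` to exactly one of two outputs, depending on `pred`.
    return (value, None) if pred else (None, value)

def merge(true_input, false_input):
    # Of the two inputs, only one is produced; forward that one.
    return true_input if true_input is not None else false_input

def cond(pred, if_true, if_false):
    pred_true, pred_false = switch(pred, pred)
    # Each branch runs only if its Switch output was produced, mimicking
    # the control dependency on pred_true / pred_false.
    out_true = if_true() if pred_true is not None else None
    out_false = if_false() if pred_false is not None else None
    return merge(out_true, out_false)

count = [0]

def adder():
    count[0] += 1
    return count[0]

def subtractor():
    count[0] -= 2
    return count[0]

print(cond(True, adder, subtractor))  # 1: only adder ran
print(count[0])                       # 1: subtractor's side effect did not fire
```

Because the branch bodies hang off the Switch outputs, only the taken branch's side effect occurs, which mirrors the behavior of the question's second version.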


This means you can run your second version, and be content that the graph remains a fixed size, regardless of how many steps you run.


The reason your first version doesn't work is that, when a Tensor is captured (like adder or subtractor in your example), an additional Switch node is added to enforce the logic that the value of the tensor is only forwarded to the branch that actually executes. This is an artifact of how TensorFlow combines feed-forward dataflow and control flow in its execution model. The result is that the captured tensors (in this case the results of the assign_add and assign_sub) will always be evaluated, even if they aren't used, and you'll see their side effects. This is something we need to document better, and as Michael says, we're going to make this more usable in future.
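In the same simulation style, the capture pitfall can be sketched as follows. This is a hypothetical illustration of the behavior described above, not TensorFlow internals: the captured values are computed before any branch selection happens, so both side effects always fire:

```python
# Sketch of the capture pitfall: the side-effecting computations are built
# OUTSIDE the branch lambdas, so they feed the extra Switch nodes and are
# evaluated up front, regardless of which branch is taken.
count = [0]

def run_first_version(pred):
    count[0] += 1              # adder's side effect fires...
    adder_out = count[0]
    count[0] -= 2              # ...and so does subtractor's
    subtractor_out = count[0]
    # The Switch merely forwards one of the already-computed values.
    return adder_out if pred else subtractor_out

print(run_first_version(True))  # 1 (the forwarded value)
print(count[0])                 # -1: both side effects ran anyway
```

This reproduces the question's observation: with `pred=True` the returned value comes from the adder, but the counter still ends at -1 because the subtraction also executed.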

