Force copy of tensor when enqueuing

Question

First, I'm not sure if the title is very good, but it was the best I could come up with given my understanding of the situation.

The background is that I'm trying to understand how queues work in TensorFlow, and I ran into the following issue, which puzzled me.

I have a variable n, which I enqueue to a tf.FIFOQueue, and then I increment the variable. This is repeated several times, and one would expect a result similar to 0, 1, 2, ... However, when emptying the queue, all values are the same.

More precisely, the code is as follows:

from __future__ import print_function

import tensorflow as tf

q = tf.FIFOQueue(10, tf.float32)

n = tf.Variable(0, trainable=False, dtype=tf.float32)
inc = n.assign(n + 1)   # op that increments the variable
enqueue = q.enqueue(n)  # op that pushes (what I expected to be) the current value

init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)

# Enqueue, then increment, three times.
sess.run(enqueue)
sess.run(inc)

sess.run(enqueue)
sess.run(inc)

sess.run(enqueue)
sess.run(inc)

# Drain the queue.
print(sess.run(q.dequeue()))
print(sess.run(q.dequeue()))
print(sess.run(q.dequeue()))

What I expected it to print:

0.0
1.0
2.0

Instead, I get the following result:

3.0
3.0
3.0

似乎我正在将一些指向 n 的指针推送到队列,而不是实际值,这正是我想要的.但是,我对 tensorflow 内部结构并没有任何实际了解,所以也许还有其他事情发生?

It seems like I'm pushing some pointer to n to the queue, instead of the actual value, which is what I want. However, I don't really have any actual understanding of tensorflow internals, so maybe something else is going on?

I tried changing

enqueue = q.enqueue(n)

to

enqueue = q.enqueue(tf.identity(n))

since the answers to "How can I copy a variable in tensorflow" and "In TensorFlow, what is tf.identity used for?" gave me the impression that it might help, but it does not change the result. I also tried adding a tf.control_dependencies(), but again, all values are the same when dequeuing.
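
In case it matters, the control_dependencies variant looked roughly like this (a sketch from memory, reusing q and n from the code above; the exact form I ran may have differed):

copy = tf.identity(n)
with tf.control_dependencies([copy]):
    enqueue = q.enqueue(copy)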

The output above is from running the code on a computer with a single CPU. While trying to see if there was some difference between versions of TensorFlow, I noticed that if I run the code on a computer with both a CPU and a GPU, I get the "expected" result. Indeed, if I run with CUDA_VISIBLE_DEVICES="" I get the result above, and with CUDA_VISIBLE_DEVICES="0" I get the "expected" result.

Answer

To force a non-caching read you can do

q.enqueue(tf.add(n, 0))

This is what's currently done by the batch normalization layer to force a copy.
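
Plugged into the question's code, the fix looks like this (a minimal sketch against the same TF 0.12-era graph API; the loops are just a compact rewrite of the repeated enqueue/increment runs):

from __future__ import print_function

import tensorflow as tf

q = tf.FIFOQueue(10, tf.float32)

n = tf.Variable(0, trainable=False, dtype=tf.float32)
inc = n.assign(n + 1)
# tf.add(n, 0) produces a fresh tensor, so the queue receives a snapshot
# of the current value rather than a cached read of the variable.
enqueue = q.enqueue(tf.add(n, 0))

sess = tf.Session()
sess.run(tf.global_variables_initializer())

for _ in range(3):
    sess.run(enqueue)
    sess.run(inc)

for _ in range(3):
    print(sess.run(q.dequeue()))  # 0.0, 1.0, 2.0, also on a CPU-only machine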

The semantics of how variables get read vs. referenced are in the process of being revamped, so they are temporarily non-intuitive. In particular, I expected q.enqueue(v.read_value()) to force a non-caching read, but it doesn't fix your example on TF 0.12rc1.

Using a GPU machine puts the variable on the GPU, while the queue is CPU-only, so the enqueue op forces a GPU->CPU copy.
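
You can also reproduce that effect deliberately. Here is a hypothetical sketch (it assumes a visible GPU) that pins the variable to the GPU while the queue stays on the CPU, so each enqueue must cross devices and therefore captures a snapshot of the value:

import tensorflow as tf

q = tf.FIFOQueue(10, tf.float32)  # queue ops run on the CPU

with tf.device('/gpu:0'):  # assumes a GPU is present
    n = tf.Variable(0, trainable=False, dtype=tf.float32)
    inc = n.assign(n + 1)

enqueue = q.enqueue(n)  # forces a GPU->CPU copy of the current value

# Soft placement falls back to the CPU if some op lacks a GPU kernel.
sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
sess.run(tf.global_variables_initializer())

for _ in range(3):
    sess.run(enqueue)
    sess.run(inc)

for _ in range(3):
    print(sess.run(q.dequeue()))  # 0.0, 1.0, 2.0 when the copy occurs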
