First tf.session.run() performs dramatically different from later runs. Why?


Question

Here's an example to clarify what I mean:

First session.run():

[Screenshot: first run of a TensorFlow session]

Later session.run():

[Screenshot: later runs of a TensorFlow session (https://i.stack.imgur.com/cKjb1.png)]


I understand TensorFlow is doing some initialization here, but I'd like to know where in the source this manifests. This occurs on CPU as well as GPU, but the effect is more prominent on GPU. For example, in the case of an explicit Conv2D operation, the first run has a much larger number of Conv2D operations in the GPU stream. In fact, if I change the input size of the Conv2D, it can go from tens to hundreds of stream Conv2D operations. In later runs, however, there are always only five Conv2D operations in the GPU stream (regardless of input size). On CPU, the operation list in the first run is the same as in later runs, but we still see the same time discrepancy.


What portion of the TensorFlow source is responsible for this behavior? Where are GPU operations "split"?

Thanks for any help!

Answer


The tf.nn.conv2d() op takes much longer to run on the first tf.Session.run() invocation because, by default, TensorFlow uses cuDNN's autotune facility to choose how to run subsequent convolutions as fast as possible. You can see the autotune invocation here.
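The caching behavior can be illustrated outside TensorFlow with a minimal, self-contained sketch (plain Python, no TensorFlow; all names are illustrative, not real cuDNN or TensorFlow APIs): an expensive one-time algorithm search is cached per input size, so only the first run for a given size pays the cost. This also mirrors why changing the Conv2D input size re-triggers the slow first run.

```python
import time
from functools import lru_cache

# Toy model of cuDNN autotune (illustrative only): the "best algorithm"
# search is expensive, but its result is cached per input size, so only
# the first call for a given size pays the cost.

@lru_cache(maxsize=None)
def pick_algorithm(input_size):
    time.sleep(0.05)  # stand-in for benchmarking many candidate kernels
    return "fastest_algorithm_for_%d" % input_size

def run_conv(input_size):
    pick_algorithm(input_size)  # cached after the first call per size
    # ... the actual convolution would execute here ...

def timed(fn, *args):
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start

first = timed(run_conv, 224)    # includes the one-time "autotune" cost
later = timed(run_conv, 224)    # cache hit: fast
resized = timed(run_conv, 512)  # a new input size re-triggers the search
print("first=%.4fs later=%.6fs resized=%.4fs" % (first, later, resized))
```

In the real system the caching happens inside cuDNN/TensorFlow rather than in user code, but the timing profile (slow first run, fast repeats, slow again after a shape change) is the same.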


There is an undocumented environment variable that you can use to disable autotune. Set TF_CUDNN_USE_AUTOTUNE=0 when you start the process running TensorFlow (e.g. the Python interpreter) to disable its use.
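A minimal sketch of setting this from inside Python instead of the shell: the variable must be in the process environment before TensorFlow initializes, so it has to be set before the import (shown commented here so the sketch stands alone).

```python
import os

# Must be set before TensorFlow is imported, because the variable is read
# when the process initializes its cuDNN support.
os.environ["TF_CUDNN_USE_AUTOTUNE"] = "0"

# import tensorflow as tf  # import TensorFlow only after the variable is set
```

Equivalently, launch the process with the variable already set, e.g. `TF_CUDNN_USE_AUTOTUNE=0 python your_script.py` (script name hypothetical).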
