Understanding TensorFlow profiling results

Problem description

This example shows how to profile TensorFlow programs. I used this tool to profile my program, a simple LSTM, and the results are shown as:

/gpu:0/stream:all Compute(pid 5)

/job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)

My questions:

a) What is the meaning of each row?

b) In particular, what is the difference between /gpu:0/stream:all Compute(pid 5) and /job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)?

c) Why are their execution times different, namely 0.072 ms and 0.094 ms?

Solution

Here's an update from one of the engineers:

The '/gpu:0/stream:*' timelines are hardware traces of CUDA kernel execution times.

The '/gpu:0' lines show the TF software device enqueueing the ops on the CUDA stream (this usually takes almost zero time).
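To compare the two kinds of rows numerically, one option is to sum the event durations per row in the exported trace file. The sketch below assumes the trace is in Chrome trace-event JSON form, where "M" metadata events name each pid and "X" events carry a duration in microseconds. The sample trace, its op names, and its timings are invented for illustration and are not the measurements from the question; `load_trace` and `summarize_rows` are hypothetical helper names.

```python
import json
from collections import defaultdict


def load_trace(path):
    """Load a Chrome trace-event JSON file from disk (path is up to the caller)."""
    with open(path) as f:
        return json.load(f)


def summarize_rows(trace):
    """Return {row name: total duration in ms} for a Chrome trace dict."""
    names = {}                      # pid -> row name from metadata events
    totals_us = defaultdict(float)  # pid -> summed duration in microseconds
    for event in trace["traceEvents"]:
        if event.get("ph") == "M" and event.get("name") == "process_name":
            names[event["pid"]] = event["args"]["name"]
        elif event.get("ph") == "X":
            # Complete events: "dur" is in microseconds.
            totals_us[event["pid"]] += event["dur"]
    return {names.get(pid, str(pid)): us / 1000.0
            for pid, us in totals_us.items()}


# Synthetic trace with made-up timings, shaped like the two rows discussed above.
SAMPLE_TRACE = {
    "traceEvents": [
        {"ph": "M", "name": "process_name", "pid": 3,
         "args": {"name": "/job:localhost/replica:0/task:0/gpu:0 Compute"}},
        {"ph": "M", "name": "process_name", "pid": 5,
         "args": {"name": "/gpu:0/stream:all Compute"}},
        # Software device enqueueing an op on the stream.
        {"ph": "X", "name": "MatMul", "pid": 3, "tid": 0, "ts": 100, "dur": 4},
        # Hardware-traced kernel executions for that work.
        {"ph": "X", "name": "MatMul", "pid": 5, "tid": 0, "ts": 110, "dur": 30},
        {"ph": "X", "name": "MatMul", "pid": 5, "tid": 0, "ts": 145, "dur": 20},
    ]
}

if __name__ == "__main__":
    for row, ms in summarize_rows(SAMPLE_TRACE).items():
        print(f"{row}: {ms:.3f} ms")
```

For a real trace, `summarize_rows(load_trace(path))` would give the same kind of per-row totals, which makes it easier to see that the two rows measure different things (enqueue time versus kernel execution time) rather than the same work twice.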
