Understanding tensorflow profiling results

Views: 55 Published: 2021/6/21 20:17:44 Tags: tensorflow, profiling

This example shows how to profile tensorflow programs. I used this tool to profile my program, a simple LSTM. The resulting timeline contains these rows:

/gpu:0/stream:all Compute(pid 5)

/job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)

My questions:

a) What is the meaning of each row?

b) In particular, what is the difference between /gpu:0/stream:all Compute(pid 5) and /job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)?

c) Why are their execution times different, namely 0.072 ms and 0.094 ms?

Solution

Here's an update from one of the engineers:

The '/gpu:0/stream:*' timelines are hardware traces of CUDA kernel execution times.

The '/gpu:0' lines show the TF software device enqueueing the ops on the CUDA stream (this usually takes almost zero time).
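To compare the two kinds of rows numerically, one option is to read the saved trace file and total the event durations per row. The sketch below assumes the trace is in the Chrome trace event format (process_name metadata events plus "X" events with durations in microseconds). The helper name durations_by_row, the op name MatMul, and all numbers in SAMPLE are made up for illustration and are not taken from the question's profile.

```python
import json
from collections import defaultdict


def durations_by_row(trace_json):
    """Return {row label: total event duration in ms} for a Chrome-trace JSON string."""
    events = json.loads(trace_json)["traceEvents"]
    # Metadata events (ph == "M", name == "process_name") map each pid
    # to the row label shown in the trace viewer.
    labels = {
        e["pid"]: e["args"]["name"]
        for e in events
        if e.get("ph") == "M" and e.get("name") == "process_name"
    }
    totals = defaultdict(float)
    for e in events:
        if e.get("ph") == "X":  # complete events: "dur" is in microseconds
            label = labels.get(e["pid"], "pid %s" % e["pid"])
            totals[label] += e["dur"] / 1000.0
    return dict(totals)


# Hypothetical sample trace: one kernel on the hardware stream row and the
# matching enqueue on the software device row. All values are invented.
SAMPLE = json.dumps({
    "traceEvents": [
        {"ph": "M", "name": "process_name", "pid": 3,
         "args": {"name": "/job:localhost/replica:0/task:0/gpu:0 Compute"}},
        {"ph": "M", "name": "process_name", "pid": 5,
         "args": {"name": "/gpu:0/stream:all Compute"}},
        {"ph": "X", "name": "MatMul", "pid": 3, "tid": 0, "ts": 100, "dur": 5},
        {"ph": "X", "name": "MatMul", "pid": 5, "tid": 0, "ts": 110, "dur": 50},
    ]
})

rows = durations_by_row(SAMPLE)
for label, ms in sorted(rows.items()):
    print("%s: %.3f ms" % (label, ms))
```

With a real trace, pass the file's contents to durations_by_row instead of SAMPLE. The stream rows then give time spent executing kernels on the GPU, and the device rows give time spent on the host side enqueueing them.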
