Why does CUDA Profiler indicate replayed instructions: 82% != global replay + local replay + shared replay?


Problem description

I got the information below from the CUDA Profiler, and I am confused: why is Replayed Instructions != Global memory replay + Local memory replay + Shared bank conflict replay?

Here is the information I got from the profiler:

Replayed Instructions(%): 81.60
Global memory replay(%): 21.80
Local memory replays(%): 0.00
Shared bank conflict replay(%): 0.00

Could you help me explain this? Is there any other case that causes instruction replay?

Recommended answer

Because the SM can replay instructions due to other factors, such as divergent branching logic.

So I can assume that roughly 60% of your issued instructions are being reissued due to branching and roughly 20% due to global memory. Can you post a snippet?
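The 60% and 20% figures come from subtracting the three memory replay percentages from the total. A minimal Python sketch of that arithmetic, using only the numbers reported in the question (the variable names are illustrative, not profiler identifiers):

```python
# Figures reported by the profiler in the question,
# each expressed as a percentage of issued instructions.
replayed_total = 81.60
global_replay = 21.80
local_replay = 0.00
shared_replay = 0.00

# Portion of the total explained by the three memory categories.
accounted = global_replay + local_replay + shared_replay

# Portion the three categories do not explain; the answer attributes
# this remainder to other replay causes.
unaccounted = replayed_total - accounted

print(f"explained by memory replay categories: {accounted:.2f}%")
print(f"left over for other causes:            {unaccounted:.2f}%")
```

The remainder is only an upper bound on any single other cause; the profiler output in the question does not say which cause it is.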

From the F1 Help menu of the CUDA 4.0 profiler:


Replayed Instructions (%): This gives the percentage of instructions replayed during kernel execution. Replayed instructions are the difference between the number of instructions that are actually issued by the hardware and the number of instructions that are to be executed by the kernel. Ideally this should be zero. This is calculated as 100 * (instructions issued - instructions executed) / instructions issued
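To get a feel for what this formula implies, the sketch below evaluates it and inverts it to recover the ratio of issued to executed instructions. The function names and the sample counts are assumptions chosen for illustration; only the formula itself comes from the help text.

```python
def replayed_instructions_pct(issued, executed):
    """100 * (issued - executed) / issued, as stated in the help text."""
    if issued <= 0:
        raise ValueError("issued must be positive")
    return 100.0 * (issued - executed) / issued


def issue_ratio(replayed_pct):
    """Invert the formula: how many instructions are issued per executed one."""
    if not 0.0 <= replayed_pct < 100.0:
        raise ValueError("replayed_pct must be in [0, 100)")
    return 1.0 / (1.0 - replayed_pct / 100.0)


# Hypothetical counts chosen so the result matches the question's 81.60%.
pct = replayed_instructions_pct(issued=10_000, executed=1_840)
print(f"replayed instructions: {pct:.2f}%")
print(f"issued per executed:   {issue_ratio(pct):.2f}x")
```

Read this way, a replay percentage above 80% means the hardware issues several instructions for every one the kernel needs to execute.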

Global memory replay (%): Percentage of replayed instructions caused due to global memory accesses. This is calculated as 100 * (l1 global load miss) / instructions issued

Local memory replay (%): Percentage of replayed instructions caused due to local memory accesses. This is calculated as 100 * (l1 local load miss + l1 local store miss) / instructions issued

Shared bank conflict replay (%): Percentage of replayed instructions caused due to shared memory bank conflicts. This is calculated as 100 * (l1 shared conflict) / instructions issued
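Taken together, the four definitions show why the numbers need not add up: the total is derived from issued minus executed instructions, while each category is derived from one specific event counter. The sketch below applies the formulas as quoted; the dictionary keys and the counter values are hypothetical, picked only to reproduce the percentages in the question.

```python
def replay_breakdown(counters):
    """Apply the help-text formulas to a dict of raw counter values."""
    issued = counters["inst_issued"]
    if issued <= 0:
        raise ValueError("inst_issued must be positive")

    # Total comes from the issued/executed difference.
    total = 100.0 * (issued - counters["inst_executed"]) / issued

    # Each category comes from its own event counter.
    global_pct = 100.0 * counters["l1_global_load_miss"] / issued
    local_pct = 100.0 * (
        counters["l1_local_load_miss"] + counters["l1_local_store_miss"]
    ) / issued
    shared_pct = 100.0 * counters["l1_shared_bank_conflict"] / issued

    # Whatever the three counters do not cover is left as "other".
    other_pct = total - (global_pct + local_pct + shared_pct)
    return {
        "total": total,
        "global": global_pct,
        "local": local_pct,
        "shared": shared_pct,
        "other": other_pct,
    }


# Hypothetical counter values (assumed key names, not verified profiler output).
sample = {
    "inst_issued": 10_000,
    "inst_executed": 1_840,
    "l1_global_load_miss": 2_180,
    "l1_local_load_miss": 0,
    "l1_local_store_miss": 0,
    "l1_shared_bank_conflict": 0,
}

breakdown = replay_breakdown(sample)
for name, value in breakdown.items():
    print(f"{name:>6}: {value:6.2f}%")
```

Any replay that does not increment one of those three counters still raises the issued count, so it appears in the total but in none of the categories.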
