In CUDA profiler nvvp, what does the "Shared/Global Memory Replay Overhead" mean? How is it computed?


Question

When we use the CUDA profiler nvvp, it reports several "overheads" associated with instructions, for example:


  • Branch Divergence Overhead;

  • Shared/Global Memory Replay Overhead; and

  • Local/Global Cache Replay Overhead.

My questions are:

  1. Similarly, how is the Global Load/Store Efficiency computed?

Attachment: I found all the formulas for computing these overheads in the 'CUDA Profiler Users Guide' packaged with the CUDA5 toolkit.

Recommended answer

You can find some of the answers to your question here:

Why does CUDA Profiler indicate replayed instructions: 82% != global replay + local replay + shared replay?


Replayed Instructions (%): This gives the percentage of instructions replayed during kernel execution. Replayed instructions are the difference between the number of instructions actually issued by the hardware and the number of instructions to be executed by the kernel. Ideally this should be zero. It is calculated as 100 * (instructions issued - instructions executed) / instructions issued
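As a rough illustration of the Replayed Instructions formula, here is a minimal Python sketch. The function name, parameter names, and counter values are made up for this example; they are not taken from the profiler or the guide.

```python
def replayed_instructions_pct(inst_issued: int, inst_executed: int) -> float:
    """Return 100 * (issued - executed) / issued, per the quoted formula.

    Both arguments are hypothetical names for the two instruction counts
    the formula refers to; they are not a profiler API.
    """
    if inst_issued <= 0:
        raise ValueError("inst_issued must be positive")
    return 100.0 * (inst_issued - inst_executed) / inst_issued


# Made-up counter values, for illustration only.
example_pct = replayed_instructions_pct(inst_issued=1000, inst_executed=800)
print(f"Replayed instructions: {example_pct:.1f}%")
```

With these illustrative numbers, 200 of the 1000 issued instructions are replays, so the metric comes out to 20%. When every issued instruction is executed exactly once, the result is zero, which is the ideal case the definition mentions.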

Global memory replay (%): Percentage of replayed instructions caused by global memory accesses. It is calculated as 100 * (l1 global load miss) / instructions issued

Local memory replay (%): Percentage of replayed instructions caused by local memory accesses. It is calculated as 100 * (l1 local load miss + l1 local store miss) / instructions issued

Shared bank conflict replay (%): Percentage of replayed instructions caused by shared memory bank conflicts. It is calculated as 100 * (l1 shared conflict) / instructions issued
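The three per-cause formulas (global, local, and shared bank conflict replay) can be sketched together as follows. This is only an illustration: the function and parameter names are hypothetical stand-ins for the counters named in the formulas, and the values are invented.

```python
def replay_breakdown(
    inst_issued: int,
    l1_global_load_miss: int,
    l1_local_load_miss: int,
    l1_local_store_miss: int,
    l1_shared_conflict: int,
) -> dict:
    """Apply the three quoted replay formulas to raw counter values.

    All parameter names are hypothetical labels for the counters that
    appear in the formulas; they are not a profiler API.
    """
    if inst_issued <= 0:
        raise ValueError("inst_issued must be positive")
    return {
        "global_replay_pct": 100.0 * l1_global_load_miss / inst_issued,
        "local_replay_pct": 100.0
        * (l1_local_load_miss + l1_local_store_miss)
        / inst_issued,
        "shared_replay_pct": 100.0 * l1_shared_conflict / inst_issued,
    }


# Invented counter values, for illustration only.
breakdown = replay_breakdown(
    inst_issued=2000,
    l1_global_load_miss=200,
    l1_local_load_miss=30,
    l1_local_store_miss=20,
    l1_shared_conflict=100,
)
for name, pct in breakdown.items():
    print(f"{name}: {pct:.1f}%")
```

Each of these percentages is built from its own event counters, while the overall Replayed Instructions figure is built from issued and executed instruction counts. Nothing in the formulas forces the three values to add up to the overall figure, which is the mismatch the linked question asks about.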
