What's the difference between the "gld/st_throughput" and "dram_read/write_throughput" metrics?
Problem description
In version 5 of the CUDA Visual Profiler, I know that "gld/st_requested_throughput" is the memory throughput requested by the application. However, when I try to find the actual hardware throughput, I am confused because two pairs of metrics seem to qualify: "gld/st_throughput" and "dram_read/write_throughput". Which pair is actually the hardware throughput? And what does the other pair measure?
Recommended answer
gld/st_throughput includes transactions served by the L1 and L2 caches, while dram_read/write_throughput is the throughput between L2 and device memory. So every global memory access counts towards gld/st_throughput, but only requests that missed both the L1 and L2 caches count towards dram_read/write_throughput.
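To make the counting rule concrete, here is a toy model in Python. It is only a sketch and not how the profiler computes its metrics: the single LRU cache stands in for L1 and L2 combined, and the 128-byte line size, the cache capacity, and the function name count_traffic are all assumptions made for illustration. Every access adds to the load total (the analogue of gld_throughput), and only misses add to the DRAM total (the analogue of dram_read_throughput).

```python
# Toy model only: one small LRU cache stands in for "L1 and L2" combined.
LINE_BYTES = 128  # assumed transaction/line size, for illustration


def count_traffic(addresses, cache_lines=4):
    """Return (load_bytes, dram_bytes) for a sequence of byte addresses."""
    cache = []  # cached line indices, least recently used first
    load_bytes = 0
    dram_bytes = 0
    for addr in addresses:
        line = addr // LINE_BYTES
        load_bytes += LINE_BYTES          # every access counts as a load transaction
        if line in cache:
            cache.remove(line)            # hit: served by the cache, no DRAM traffic
        else:
            dram_bytes += LINE_BYTES      # miss: line is fetched from device memory
            if len(cache) == cache_lines:
                cache.pop(0)              # evict the least recently used line
        cache.append(line)                # mark this line as most recently used
    return load_bytes, dram_bytes


# Two lines re-read 50 times each: lots of load traffic, almost no DRAM traffic.
reuse = count_traffic([0, 128] * 50)
# 100 distinct lines read once each: load traffic and DRAM traffic are equal.
stream = count_traffic(range(0, 100 * LINE_BYTES, LINE_BYTES))
print("reuse :", reuse)
print("stream:", stream)
```

In this model the reuse pattern and the streaming pattern produce the same load total, but the reuse pattern touches device memory only for its first two lines. A large gap between the two metrics in the profiler points in the same direction: most loads are being served from cache.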
I haven't found a good overview of the counters anywhere. Wish NVIDIA would provide that...