L2 cache in NVIDIA Fermi


Question

When looking at the names of the performance counters for the NVIDIA Fermi architecture (the file Compute_profiler.txt in the doc folder of CUDA), I noticed that there are two performance counters for L2 cache misses: l2_subp0_read_sector_misses and l2_subp1_read_sector_misses. They are described as being for two slices of L2.

Why are there two slices of L2? Is there any relation to the streaming multiprocessor architecture? What effect would this division have on performance?

Thanks.

Recommended answer

I don't think there is any direct relation to the streaming multiprocessor.

I think a slice is simply the equivalent of a memory bank.

To get the "total" L2 read misses, just sum the values of the two counters.
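As a minimal sketch of that summation in Python: the two counter names come from the question, while the function name, the dictionary layout, and the sample values are hypothetical placeholders rather than real profiler output.

```python
# Counter names are taken from the question; everything else here
# (function name, dict layout, sample values) is a hypothetical illustration.
L2_READ_MISS_COUNTERS = (
    "l2_subp0_read_sector_misses",
    "l2_subp1_read_sector_misses",
)


def total_l2_read_sector_misses(counters):
    """Sum the per-slice L2 read sector miss counters into one total."""
    # Assumption: a counter that was not collected is treated as 0.
    return sum(counters.get(name, 0) for name in L2_READ_MISS_COUNTERS)


# Made-up placeholder values, not measurements from a real run.
sample = {
    "l2_subp0_read_sector_misses": 1200,
    "l2_subp1_read_sector_misses": 1350,
}
print(total_l2_read_sector_misses(sample))  # prints 2550
```

Treating a missing counter as 0 keeps the sketch short, but it can hide the case where only one slice was profiled, so a stricter version might raise an error instead.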
