My GPU has 2 multiprocessors with 48 CUDA cores each. What does this mean?

Question

My GPU has 2 multiprocessors with 48 CUDA cores each. Does this mean that I can execute 96 thread blocks in parallel?

Recommended answer

No.

From chapter 4 of the CUDA C Programming Guide:


The number of blocks and warps that can reside and be processed together on the multiprocessor for a given kernel depends on the amount of registers and shared memory used by the kernel and the amount of registers and shared memory available on the multiprocessor. There are also a maximum number of resident blocks and a maximum number of resident warps per multiprocessor. These limits as well the amount of registers and shared memory available on the multiprocessor are a function of the compute capability of the device and are given in Appendix F. If there are not enough registers or shared memory available per multiprocessor to process at least one block, the kernel will fail to launch.

Get the guide at: http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA_C_Programming_Guide.pdf

To check the limits for your specific device, compile and run the deviceQuery sample from the SDK.

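At the time of writing, the maximum number of resident blocks per multiprocessor is the same across all compute capabilities and is equal to 8. This GPU can therefore hold at most 2 × 8 = 16 resident blocks, and fewer if the kernel's register or shared memory use is the tighter limit. Later compute capabilities raise the per-multiprocessor maximum.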
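The interplay of these limits can be sketched as a small host-side calculation. This is only an illustration, not NVIDIA's occupancy calculator: the struct and function names are hypothetical, the default limits (48 resident warps, 32768 registers, 48 KB of shared memory per multiprocessor) are assumed values for a compute capability 2.x device, and allocation granularity is ignored, so real hardware may fit fewer blocks. Read the actual limits from your device query output.

```cpp
#include <algorithm>

// Per-multiprocessor limits. Defaults are ASSUMED values for a compute
// capability 2.x device; replace them with the numbers your device reports.
struct SmLimits {
    int max_resident_blocks = 8;      // figure quoted in the answer
    int max_resident_warps  = 48;     // assumption
    int registers           = 32768;  // assumption (32-bit registers)
    int shared_mem_bytes    = 49152;  // assumption (48 KB)
    int warp_size           = 32;
};

// Hypothetical description of what one kernel launch needs per block.
struct KernelUsage {
    int threads_per_block;
    int registers_per_thread;
    int shared_mem_per_block;  // bytes
};

// Upper bound on the blocks that can be resident on one multiprocessor:
// the tightest of the block, warp, register and shared memory limits.
// A result of 0 means not even one block fits, so the launch would fail.
int resident_blocks_per_sm(const SmLimits& sm, const KernelUsage& k) {
    // Threads are scheduled in whole warps, so round up.
    int warps_per_block = (k.threads_per_block + sm.warp_size - 1) / sm.warp_size;
    int by_warps = sm.max_resident_warps / warps_per_block;

    int regs_per_block = k.registers_per_thread * k.threads_per_block;
    int by_regs = regs_per_block > 0 ? sm.registers / regs_per_block
                                     : sm.max_resident_blocks;

    int by_smem = k.shared_mem_per_block > 0
                      ? sm.shared_mem_bytes / k.shared_mem_per_block
                      : sm.max_resident_blocks;

    return std::min({sm.max_resident_blocks, by_warps, by_regs, by_smem});
}
```

Under these assumed limits, a kernel launched with 256 threads per block, 20 registers per thread and 4096 bytes of shared memory per block is bounded at 6 resident blocks per multiprocessor (both the warp and register limits give 6), so 12 across the two multiprocessors, well short of 96.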
