Number of total threads, blocks, and grids on my GPU
Question
For the NVIDIA GeForce 940MX GPU, Device Query shows it has 3 multiprocessors and 128 cores per multiprocessor.
Number of threads per multiprocessor = 2048
So 3 * 2048 = 6144, i.e. a total of 6144 threads in the GPU.
6144 / 1024 = 6, i.e. a total of 6 blocks. And the warp size is 32.
But from this video https://www.youtube.com/watch?v=kzXjRFL-gjo I found that each GPU has a limit on threads, but no limit on the number of blocks.
So I got confused by this. I would like to know:
- How many total threads are in my GPU? Can we use all of the threads to execute a program?
- How many blocks and grids are there?
Answer
It appears the main source of your confusion is mixing up two completely different sets of limits:
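- The maximum number of threads and blocks that can run concurrently on the GPU.
- The maximum number of threads and blocks that can be launched for a given kernel.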
The numbers you quote (2048 threads per multiprocessor, three multiprocessors in total = 6144 threads) represent the first set of limits. The numbers you show in your screenshot of the deviceQuery output:
Maximum number of threads per multiprocessor: 2048
Maximum number of threads per block: 1024
Max dimension size of a thread block (x,y,z): (1024, 1024, 64)
Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535)
define the limits of a given kernel launch. While they overlap somewhat, you can treat them as more or less separate. For a more thorough discussion of the practicalities of kernel launch parameters and block dimensions, see here.
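To make the difference between the two sets of limits concrete, here is a rough sketch of the arithmetic in Python, using the numbers from the deviceQuery output above. The function names are made up for illustration. The cap of 32 resident blocks per multiprocessor is an assumption that does not appear in the post; it is the figure documented for compute capability 5.x devices. Register and shared-memory usage, which can lower occupancy further, are ignored.

```python
# Values copied from the deviceQuery output quoted in the post.
SM_COUNT = 3
MAX_THREADS_PER_SM = 2048
MAX_THREADS_PER_BLOCK = 1024
MAX_GRID_DIM = (2147483647, 65535, 65535)

# Assumption (not in the post): per-multiprocessor cap on resident blocks,
# taken as 32, the documented value for compute capability 5.x.
MAX_BLOCKS_PER_SM = 32


def resident_threads():
    """First set of limits: threads that can be resident on the GPU at once."""
    return SM_COUNT * MAX_THREADS_PER_SM


def resident_blocks(threads_per_block):
    """First set of limits: blocks resident at once for a given block size.

    Ignores register and shared-memory pressure, so this is an upper bound.
    """
    if not 1 <= threads_per_block <= MAX_THREADS_PER_BLOCK:
        raise ValueError("block size is outside the per-block limit")
    per_sm = min(MAX_THREADS_PER_SM // threads_per_block, MAX_BLOCKS_PER_SM)
    return SM_COUNT * per_sm


def max_launchable_blocks():
    """Second set of limits: blocks a single kernel launch may request."""
    x, y, z = MAX_GRID_DIM
    return x * y * z


def waves(total_blocks, threads_per_block):
    """Rough number of rounds needed to work through a launch.

    A simplification: the hardware starts a new block whenever one
    retires rather than in strict rounds.
    """
    capacity = resident_blocks(threads_per_block)
    return (total_blocks + capacity - 1) // capacity


if __name__ == "__main__":
    print("resident threads:", resident_threads())
    print("resident blocks at 1024 threads/block:", resident_blocks(1024))
    print("launchable blocks per kernel:", max_launchable_blocks())
    print("rounds for 1000 blocks of 1024 threads:", waves(1000, 1024))
```

With 1024-thread blocks this reproduces the 6144 threads and 6 blocks from the question, but those are only what can be resident at the same moment. A launch can ask for far more blocks than that, and the GPU works through them as running blocks finish.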