What is the maximum block count possible in CUDA?
Question
Theoretically, you can have 65535 blocks per dimension of the grid, up to 65535 * 65535 * 65535.
My question is: if you call a kernel like this, kernel<<< BLOCKS, THREADS >>>() (without dim3 objects), what is the maximum number available for BLOCKS?
In an application of mine, I set it to 192000 and it seemed to work fine... The problem is that the kernel I used changes the contents of a huge array, so although I checked some parts of the array and they seemed fine, I can't be sure whether the kernel behaved strangely at other parts.
For the record, I have a compute capability 2.1 GPU, a GTX 500 ti.
Answer
You can have at most 65535 blocks in one dimension. See Table F-2, Technical Specifications per Compute Capability (page 136), of the CUDA C Programming Guide, Version 4.1.
As Pavan pointed out, if you do not provide a dim3 for the grid configuration, you will only use the x-dimension, hence the per-dimension limit applies here.