How are 2D / 3D CUDA blocks divided into warps?

Question

If I start my kernel with a grid whose blocks have dimensions:

dim3 block_dims(16,16);

How are the grid blocks now split into warps? Do the first two rows of such a block form one warp, or the first two columns, or is this arbitrarily-ordered?

Assume a GPU Compute Capability of 2.0.

Recommended answer

Threads within a block are numbered so that threadIdx.x varies fastest, threadIdx.y next, and threadIdx.z slowest. This is functionally the same as column-major ordering in multidimensional arrays. Warps are built from consecutive threads in this ordering, so in a 16 x 16 block each warp of 32 threads covers two consecutive values of threadIdx.y (with all 16 values of threadIdx.x for each). The order is therefore not arbitrary. For a 2D block the calculation is:

unsigned int tid = threadIdx.x + threadIdx.y * blockDim.x;
unsigned int warpid = tid / warpSize;
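
The same numbering extends to 3D blocks by adding a threadIdx.z term. What follows is a minimal host-side sketch of that mapping, not code from the original answer. The helper names (linear_tid, warp_of, kWarpSize) are hypothetical. The fixed warp size of 32 is an assumption that holds on current NVIDIA GPUs, and device code should read the built-in warpSize instead.

```cpp
// Host-side sketch of how a thread's (x, y, z) index maps to a warp.
// linear_tid, warp_of and kWarpSize are illustrative names, not CUDA API.
#ifdef __CUDACC__
#define HD __host__ __device__   // also callable from kernels when built with nvcc
#else
#define HD                       // plain C++ build: no CUDA qualifiers
#endif

// Assumption: 32 threads per warp. In device code, prefer the built-in warpSize.
constexpr unsigned kWarpSize = 32;

// Linear thread ID inside a block of extent (dx, dy, dz):
// x varies fastest, then y, and z varies slowest.
HD constexpr unsigned linear_tid(unsigned x, unsigned y, unsigned z,
                                 unsigned dx, unsigned dy) {
    return x + y * dx + z * dx * dy;
}

// Index of the warp (within the block) that owns thread (x, y, z).
HD constexpr unsigned warp_of(unsigned x, unsigned y, unsigned z,
                              unsigned dx, unsigned dy) {
    return linear_tid(x, y, z, dx, dy) / kWarpSize;
}

// 16 x 16 block from the question: y = 0 and y = 1 share warp 0, y = 2 starts warp 1.
static_assert(warp_of(0, 0, 0, 16, 16) == 0, "first thread is in warp 0");
static_assert(warp_of(15, 1, 0, 16, 16) == 0, "end of second row is still warp 0");
static_assert(warp_of(0, 2, 0, 16, 16) == 1, "third row starts warp 1");
```

Inside a kernel, the equivalent call would pass threadIdx.x, threadIdx.y, threadIdx.z, blockDim.x and blockDim.y. The compile-time checks confirm that, for the 16 x 16 block in the question, each warp covers two consecutive threadIdx.y values.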

This is covered in both the CUDA programming guide and the PTX guide.
