How are 2D / 3D CUDA blocks divided into warps?

Question

If I start my kernel with a grid whose blocks have dimensions:

dim3 block_dims(16,16);

How are the grid blocks now split into warps? Do the first two rows of such a block form one warp, or the first two columns, or is this arbitrarily-ordered?

Assume a GPU Compute Capability of 2.0.

Recommended answer

Threads are numbered sequentially within a block so that threadIdx.x varies fastest, threadIdx.y second fastest, and threadIdx.z slowest. This is functionally the same as column-major ordering of a multidimensional array indexed as (x, y, z). Warps are built from consecutive threads in this ordering, warpSize threads at a time. For a 16 x 16 block and a warp size of 32, each warp therefore holds the threads of two consecutive threadIdx.y values, and the assignment is not arbitrary. The calculation for a 2D block is

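// Linear thread index within a 2D block (threadIdx.x varies fastest)
unsigned int tid = threadIdx.x + threadIdx.y * blockDim.x;
// Index of the warp within the block that this thread belongs to
unsigned int warpid = tid / warpSize;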
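Since the question also asks about 3D blocks, the same numbering can be extended with a threadIdx.z term. The following is a minimal host-side sketch in plain C++ (no GPU needed) that mirrors the device formula. The Dim3 struct and the function names linear_thread_id, warp_id and lane_id are hypothetical helpers introduced here for illustration, not part of the CUDA API, and the default warp size of 32 is an assumption that holds for NVIDIA GPUs to date.

```cpp
// Host-side sketch of how threads in a block are linearized into warps.
// Dim3 and all function names are hypothetical stand-ins, not CUDA API.
#include <cassert>

struct Dim3 { unsigned x, y, z; };  // stand-in for threadIdx / blockDim

// Linear thread index within a block: x varies fastest, then y, then z.
inline unsigned linear_thread_id(Dim3 idx, Dim3 block_dim) {
    assert(idx.x < block_dim.x && idx.y < block_dim.y && idx.z < block_dim.z);
    return idx.x
         + idx.y * block_dim.x
         + idx.z * block_dim.x * block_dim.y;
}

// Index of the warp (within the block) that the thread falls into.
// warp_size defaults to 32, an assumption about the target hardware.
inline unsigned warp_id(Dim3 idx, Dim3 block_dim, unsigned warp_size = 32) {
    assert(warp_size > 0);
    return linear_thread_id(idx, block_dim) / warp_size;
}

// Position of the thread inside its warp (its lane).
inline unsigned lane_id(Dim3 idx, Dim3 block_dim, unsigned warp_size = 32) {
    assert(warp_size > 0);
    return linear_thread_id(idx, block_dim) % warp_size;
}
```

For the 16 x 16 block in the question, warp_id(Dim3{0, 2, 0}, Dim3{16, 16, 1}) evaluates to 1: the thread with threadIdx.y == 2 and threadIdx.x == 0 has linear index 32, so it starts the second warp, and every warp spans two consecutive threadIdx.y values.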

This ordering is documented in both the CUDA programming guide and the PTX guide.
