How are CUDA threads divided into warps?

Question

If I launch my kernel with a grid of threads (for example, only 1 block):

dim3 threads(16,16);

How is this grid now split into warps? Are the first two rows of this grid one warp, or the first two columns, or is the ordering arbitrary? Assume a GPU with compute capability 2.0 and a warp size of 32.

Recommended answer

Threads are numbered sequentially within a block so that threadIdx.x varies fastest, threadIdx.y second fastest, and threadIdx.z slowest. This is functionally the same as column-major ordering in a multidimensional array indexed as (x, y, z). Warps are constructed sequentially from threads in this ordering. So the calculation for a 2D block is:

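// Linear index of this thread within its block (threadIdx.x varies fastest)
unsigned int tid = threadIdx.x + threadIdx.y * blockDim.x;
// Index of the warp this thread belongs to within the block
unsigned int warpid = tid / warpSize;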

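For the 16x16 block in the question, warp 0 therefore contains the threads with threadIdx.y equal to 0 or 1 (all 16 values of threadIdx.x for each), and the block forms 8 warps in total.

This is covered in both the programming guide and the PTX guide.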
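To check the mapping without a GPU, the same arithmetic can be reproduced on the host. The following is a minimal sketch, not device code: `Dim3`, `linear_tid`, `warp_of`, `lane_of`, `warps_per_block`, and `print_warp_map` are hypothetical names introduced here for illustration, and the warp size of 32 is the assumption stated in the question. The 3D term `idx.z * dim.x * dim.y` extends the 2D formula in the same x-fastest order. In a real kernel you would use the built-in `threadIdx`, `blockDim`, and `warpSize` instead.

```cpp
#include <cstdio>

// Hypothetical host-side stand-in for CUDA's dim3 / threadIdx (illustration only).
struct Dim3 {
    unsigned x, y, z;
};

// Linear thread index within a block: x varies fastest, then y, then z.
unsigned linear_tid(Dim3 idx, Dim3 dim) {
    return idx.x + idx.y * dim.x + idx.z * dim.x * dim.y;
}

// Warp index within the block: consecutive linear ids are grouped
// warp_size at a time (32 is assumed, as in the question).
unsigned warp_of(Dim3 idx, Dim3 dim, unsigned warp_size = 32) {
    return linear_tid(idx, dim) / warp_size;
}

// Position of the thread inside its warp (its lane).
unsigned lane_of(Dim3 idx, Dim3 dim, unsigned warp_size = 32) {
    return linear_tid(idx, dim) % warp_size;
}

// Number of warps needed for the block; the last warp may be partially filled.
unsigned warps_per_block(Dim3 dim, unsigned warp_size = 32) {
    unsigned n = dim.x * dim.y * dim.z;
    return (n + warp_size - 1) / warp_size;
}

// Print the warp index of every thread in the z = 0 plane of a block,
// one line per threadIdx.y value.
void print_warp_map(Dim3 dim) {
    for (unsigned y = 0; y < dim.y; ++y) {
        for (unsigned x = 0; x < dim.x; ++x) {
            std::printf("%u ", warp_of(Dim3{x, y, 0}, dim));
        }
        std::printf("\n");
    }
}
```

Calling `print_warp_map(Dim3{16, 16, 1})` from a `main` function prints the warp index of each thread as a 16x16 grid, one line per value of `threadIdx.y`, which makes the grouping easy to inspect for other block shapes as well.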
