Cuda gridDim and blockDim


Question

I get what blockDim is, but I have a problem with gridDim. blockDim gives the size of the block, but what is gridDim? On the Internet it says gridDim.x gives the number of blocks in the x coordinate.

How can I know what blockDim.x * gridDim.x gives?

How can I know how many gridDim.x values there are along the x line?

For example, consider the code below:

int tid = threadIdx.x + blockIdx.x * blockDim.x;  // global thread index across the grid
double temp = a[tid];
tid += blockDim.x * gridDim.x;                    // advance by one full grid-width

while (tid < count)
{
    if (a[tid] > temp)
    {
        temp = a[tid];
    }
    tid += blockDim.x * gridDim.x;                // next grid-stride position
}

I know that tid starts at 0. The code then has tid += blockDim.x * gridDim.x. What is tid now, after this operation?

Answer

  • blockDim.x,y,z gives the number of threads in a block, in the particular direction
  • gridDim.x,y,z gives the number of blocks in a grid, in the particular direction
  • blockDim.x * gridDim.x gives the number of threads in a grid (in the x direction, in this case)
  • block and grid variables can be 1, 2, or 3 dimensional. It's common practice when handling 1-D data to create only 1-D blocks and grids.

In the CUDA documentation, these variables are defined here.

In particular, when the total number of threads in the x-dimension (gridDim.x * blockDim.x) is less than the size of the array I wish to process, it's common practice to create a loop and have the grid of threads move through the entire array. In this case, after processing one loop iteration, each thread must move to the next unprocessed location, which is given by tid += blockDim.x * gridDim.x;. In effect, the entire grid of threads is jumping through the 1-D array of data, a grid-width at a time. This topic, sometimes called a "grid-striding loop", is discussed further in this blog article.

You might want to consider taking a couple of the introductory CUDA webinars available on the NVIDIA webinar page. For example, these two:

      • GPU Computing using CUDA C – An Introduction (2010) – Introduces the basics of GPU computing using CUDA C, illustrated through a walk-through of code samples. No prior GPU computing experience required.
      • GPU Computing using CUDA C – Advanced 1 (2010) – First-level optimization techniques such as global memory optimization and processor utilization. Concepts are illustrated using real code examples.

It would be 2 hours well spent, if you want to understand these concepts better.

The general topic of grid-striding loops is covered in some detail here.

