Trouble calculating offset index into 3D array


Question

I am writing a CUDA kernel to create a 3x3 covariance matrix for each location in the rows*cols main matrix. The resulting 3D matrix is rows*cols*9 in size, which I allocated in a single malloc accordingly. I need to access it with a single index value.

The 9 values of the 3x3 covariance matrix are set according to the appropriate row r and column c from some other 2D arrays.

In other words, I need to calculate the appropriate index to access the 9 elements of the 3x3 covariance matrix, the row and column offset into the 2D matrices that are inputs to the value, and the appropriate index for the storage array.

I have tried to simplify it down to the following:

   // I am calling this kernel with 1D blocks that are 512 cols x 1 row. TILE_WIDTH = 512
   int bx = blockIdx.x;
   int by = blockIdx.y;
   int tx = threadIdx.x;
   int ty = threadIdx.y;
   int r = by + ty; 
   int c = bx*TILE_WIDTH + tx;
   int offset = r*cols+c; 
   int ndx = r*cols*rows + c*cols;


   if((r < rows) && (c < cols)){ // this IF statement tries to avoid the case where a thread block extends past my original array; not sure if correct

      d_cov[ndx + 0] = otherArray[offset];//otherArray just contains a value that I might do some operations on to set each of the ndx0-ndx9 values in d_cov
      d_cov[ndx + 1] = otherArray[offset];
      d_cov[ndx + 2] = otherArray[offset];
      d_cov[ndx + 3] = otherArray[offset];
      d_cov[ndx + 4] = otherArray[offset];
      d_cov[ndx + 5] = otherArray[offset];  
      d_cov[ndx + 6] = otherArray[offset];
      d_cov[ndx + 7] = otherArray[offset];   
      d_cov[ndx + 8] = otherArray[offset];  
   }

When I check this array against the values calculated on the CPU, which loops over i=rows, j=cols, k = 1..9, the results do not match.

In other words, d_cov[i*rows*cols + j*cols + k] != correctAnswer[i][j][k]

Can anyone give me any tips on how to solve this problem? Is it an indexing problem, or some other logic error?

Recommended answer

Rather than the answer (which I haven't stared at hard enough to find), here's the technique I usually use for debugging these sorts of issues. First, set all values in your destination array to NaN. (You can do this via cudaMemset: set every byte to 0xFF.) Then try uniformly setting every location to the value of the row, then inspect the results. In theory, it should look something like:

0 0 0 ... 0
1 1 1 ... 1
. . . .   .
. . .  .  .
. . .   . .
n n n ... n

If you see NaNs, you've failed to write to an element; if you see row elements out of place, something is wrong, and they'll usually be out of place in a suggestive pattern. Do something similar with the column value, and with the plane. Usually, this trick helps me find which part of the index calculation is awry, which is most of the battle. Hope that helps.
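The NaN-fill check described above can be tried on the host before touching the kernel. The following is a minimal C++ sketch, not the asker's code: `std::memset` stands in for `cudaMemset(d_cov, 0xFF, bytes)`, and the names `packedIndex`, `questionIndex`, `FillReport` and `fillWithRow` are hypothetical helpers introduced only for illustration. `packedIndex` assumes a row-major layout in which the 9 entries of each location are contiguous, `(r*cols + c)*9 + k`; the question does not state the intended layout, so treat that as an assumption. `questionIndex` reproduces the `ndx` expression from the question so the two can be compared.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstring>
#include <vector>

// ASSUMED layout (not stated in the question): the 9 covariance entries
// of location (r, c) are stored contiguously, row-major over (r, c, k).
inline int packedIndex(int r, int c, int k, int cols) {
    return (r * cols + c) * 9 + k;
}

// The index as computed in the question: ndx = r*cols*rows + c*cols, plus k.
inline int questionIndex(int r, int c, int k, int rows, int cols) {
    return r * cols * rows + c * cols + k;
}

struct FillReport {
    int unwritten;   // elements that are still NaN after the fill
    int outOfRange;  // writes that would have landed outside the buffer
};

// Fill a rows*cols*9 float buffer with NaN by setting every byte to 0xFF
// (host stand-in for cudaMemset), write the row number to every (r, c, k)
// through `index`, then count what was missed.
template <typename IndexFn>
FillReport fillWithRow(int rows, int cols, IndexFn index) {
    std::vector<float> cov(static_cast<std::size_t>(rows) * cols * 9);
    std::memset(cov.data(), 0xFF, cov.size() * sizeof(float));

    FillReport report{0, 0};
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            for (int k = 0; k < 9; ++k) {
                int i = index(r, c, k);
                // Guard here; on the GPU this would be an out-of-bounds write.
                if (i < 0 || static_cast<std::size_t>(i) >= cov.size()) {
                    ++report.outOfRange;
                    continue;
                }
                cov[i] = static_cast<float>(r);
            }
        }
    }
    for (float v : cov) {
        if (std::isnan(v)) ++report.unwritten;
    }
    return report;
}
```

Calling `fillWithRow(4, 5, ...)` with a lambda that forwards to `packedIndex` should report no unwritten elements, while forwarding to `questionIndex` leaves part of the buffer as NaN because different (r, c, k) triples map to the same slot. Under the assumed layout, that points at the `ndx` expression as the part of the calculation to look at; the same loop can write `c` or `k` instead of `r` to check the column and plane.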
