atomicInc() is not working
I tried the program below, which uses atomicInc().
__global__ void ker(int *count)
{
    int n = 1;
    int x = atomicInc((unsigned int *)&count[0], n);
    CUPRINTF("In kernel count is %d\n", count[0]);
}
int main()
{
    int hitCount[1];
    int *hitCount_d;
    hitCount[0] = 1;
    cudaMalloc((void **)&hitCount_d, 1 * sizeof(int));
    cudaMemcpy(&hitCount_d[0], &hitCount[0], 1 * sizeof(int), cudaMemcpyHostToDevice);
    ker<<<1,4>>>(hitCount_d);
    cudaMemcpy(&hitCount[0], &hitCount_d[0], 1 * sizeof(int), cudaMemcpyDeviceToHost);
    printf("count is %d\n", hitCount[0]);
    return 0;
}
Output is:
In kernel count is 1
In kernel count is 1
In kernel count is 1
In kernel count is 1
count is 1
I don't understand why it is not incrementing. Can anyone help?
Referring to the documentation, atomicInc does this. For the following call:
atomicInc((unsigned int *)&count[0], n);
it computes:
((count[0] >= n) ? 0 : (count[0]+1))
and stores the result back in count[0].
(If you're not sure what the ? operator does: it is the C conditional operator, which yields the first alternative when the condition is true and the second otherwise.)
Since you've passed n = 1, and count[0] starts out at 1, atomicInc never actually increments the variable count[0] beyond 1.
If you want to see it increment beyond 1, pass a larger value for n.
The variable n actually acts as a "rollover value" for the incrementing process. When the variable to be incremented reaches the value of n, the next atomicInc will reset it to zero.
Although you haven't asked the question, you might ask, "Why do I never see a value of zero, if I am hitting the rollover value?"
To answer this, you must remember that all 4 of your threads are executing in lockstep. All 4 of them execute the atomicInc instruction before any of them executes the subsequent print statement.
Therefore we have a variable, count[0], which starts out at 1.
- The first thread to execute the atomic resets it to zero.
- The next thread increments it to 1.
- The third thread resets it to zero.
- The fourth and final thread increments it to 1.
Then all 4 threads print out the value.
As another experiment, try launching 5 threads instead of 4, and see if you can predict what value will be printed.
ker<<<1,5>>>(hitCount_d);
As @talonmies indicated in the comments, if you swap your atomicInc for an atomicAdd:
int x = atomicAdd((unsigned int *)&count[0], n);
you'll get the results that you were probably expecting.