Concurrent updates (x += a) to global memory in OpenCL
Question
I'm doing the following in an OpenCL kernel (simplified example):
__kernel void step(const uint count, __global int *map, __global float *sum)
{
    const uint i = get_global_id(0);
    if (i < count) {
        sum[map[i]] += 12.34;
    }
}
Here, sum is some quantity I want to calculate (previously set to zero in another kernel) and map is a mapping from integers i to integers j, such that multiple i's can map to the same j.
(map could be in constant memory rather than global, but it seems the amount of constant memory on my GPU is incredibly limited.)
Will this work? Is a "+=" implemented in an atomic way, or is there a chance of concurrent operations overwriting each other?
Recommended answer
Will this work? Is a "+=" implemented in an atomic way, or is there a chance of concurrent operations overwriting each other?
It will not work. A "+=" on global memory is a separate read, add, and write, so two work-items that map to the same j can overwrite each other's result. When threads access memory written to by other threads, you need to explicitly resort to atomic operations. In this case, atomic_add.
Something like:
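__kernel void step(const uint count, __global int *map, volatile __global int *sum)
{
    const uint i = get_global_id(0);
    if (i < count) {
        // atomic_add is defined only for 32-bit int/uint, so sum must be an
        // integer buffer; 1234 presumably stands for 12.34 in hundredths.
        atomic_add(&sum[map[i]], 1234);
    }
}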
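The sum buffer in the question holds float values, while the built-in atomic_add covers only 32-bit integer types. One common workaround is a compare-and-swap loop on the float's bit pattern. The following is a minimal sketch, assuming OpenCL 1.1 or later (where atomic_cmpxchg on __global uint is a core built-in). atomic_add_float is a hypothetical helper name used for illustration, not part of OpenCL.

```c
// Hypothetical helper (not an OpenCL built-in): atomically adds `value` to
// the float at `addr` by retrying a 32-bit compare-and-swap until it succeeds.
void atomic_add_float(volatile __global float *addr, float value)
{
    // View the same 4 bytes as an unsigned int so atomic_cmpxchg can be used.
    volatile __global uint *addr_bits = (volatile __global uint *)addr;
    uint old_bits, new_bits;
    do {
        old_bits = *addr_bits;                           // snapshot the current bits
        new_bits = as_uint(as_float(old_bits) + value);  // bits of the updated sum
        // atomic_cmpxchg returns the value found in memory; the store happened
        // only if that value equals our snapshot, otherwise retry.
    } while (atomic_cmpxchg(addr_bits, old_bits, new_bits) != old_bits);
}

__kernel void step(const uint count, __global int *map, volatile __global float *sum)
{
    const uint i = get_global_id(0);
    if (i < count) {
        atomic_add_float(&sum[map[i]], 12.34f);
    }
}
```

Because floating-point addition is not associative, the order in which work-items win the loop can change the last bits of the result from run to run. Under heavy contention (many i mapping to the same j) the retry loop may also be slow, in which case the fixed-point integer approach above may be preferable.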