CUDA __threadfence()
Question
I have gone through many forums and the NVIDIA manual, but I couldn't understand what __threadfence() is and what it is used for.
Thanks.
Recommended answer
Normally, there is no guarantee that if one block writes something to global memory, another block will "see" it. There is also no guarantee regarding the ordering of writes to global memory, except as observed by the block that issued them.
There are two exceptions:
- atomic operations: these are always visible to other blocks
- __threadfence()
Imagine that one block produces some data and then uses an atomic operation to set a flag indicating that the data is there. It is still possible that another block will see the flag but read incorrect or incomplete data.
The __threadfence() function stalls the current thread until its writes to global memory are guaranteed to be visible to all other threads in the grid. So, if you do something like:
- store your data
- call __threadfence()
- atomically set the flag
it is guaranteed that if another block sees the flag, it will also see the data.
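The store / fence / atomic sequence can be sketched roughly as follows. This is only an illustration modeled on the "last block finishes the job" pattern; the names blocks_done, sum_blocks, partial, total and run_sum are hypothetical and not part of any CUDA API. Instead of one block spinning on a flag, each block bumps an atomic counter after publishing its data, so no block ever waits on another. Error checking is omitted for brevity.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Hypothetical counter: how many blocks have published their partial sum.
__device__ unsigned int blocks_done = 0;

// Each block sums a strided slice of `in`, publishes it, and the block that
// finishes last adds up all partial sums. `partial` is volatile so the final
// reads go to memory rather than a stale cached copy.
__global__ void sum_blocks(const float* in, int n,
                           volatile float* partial, float* total)
{
    if (threadIdx.x != 0) return;  // one worker thread per block, for brevity

    float s = 0.0f;
    for (int i = blockIdx.x; i < n; i += gridDim.x) s += in[i];

    // 1. Store the data.
    partial[blockIdx.x] = s;

    // 2. Fence: the store above must be visible to other blocks before
    //    the counter update below is.
    __threadfence();

    // 3. Atomically mark this block as done; atomicAdd returns the old value.
    unsigned int ticket = atomicAdd(&blocks_done, 1u);

    // The block that drew the last ticket knows every other block has
    // already passed its fence, so all partial sums are readable.
    if (ticket == gridDim.x - 1) {
        float t = 0.0f;
        for (unsigned int b = 0; b < gridDim.x; ++b) t += partial[b];
        *total = t;
        blocks_done = 0;  // reset so the kernel can be launched again
    }
}

// Host helper (hypothetical): sums n ones on the GPU using `blocks` blocks.
float run_sum(int n, int blocks)
{
    std::vector<float> host(n, 1.0f);
    float *in, *partial, *total;
    cudaMalloc(&in, n * sizeof(float));
    cudaMalloc(&partial, blocks * sizeof(float));
    cudaMalloc(&total, sizeof(float));
    cudaMemcpy(in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    sum_blocks<<<blocks, 32>>>(in, n, partial, total);

    float result = 0.0f;
    // A blocking copy on the default stream; it completes after the kernel.
    cudaMemcpy(&result, total, sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(in);
    cudaFree(partial);
    cudaFree(total);
    return result;
}
```

Without the __threadfence() call, the last block could observe the final counter value and still read a stale partial[b] from some other block, which is exactly the "sees the flag but not the data" problem described above.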
Further reading: CUDA Programming Guide, Chapters B.2.4 and B.5