CUDA __threadfence()


Question

I have gone through many forums and the NVIDIA manual, but I couldn't understand what __threadfence() is and what it is used for.

Thanks.

Recommended answer

Normally, there is no guarantee that if one block writes something to global memory, another block will "see" it. There is also no guarantee regarding the ordering of writes to global memory, with the exception of the block that issued them.

There are two exceptions:


  • Atomic operations: they are always visible to other blocks

  • __threadfence()

Imagine that one block produces some data and then uses an atomic operation to set a flag saying that the data is there. Without a fence, it is possible that another block will see the flag but still read incorrect or incomplete data.

The __threadfence() function stalls the current thread until its writes to global memory are guaranteed to be visible to all other threads in the grid. So, if you do something like:


  1. Store your data

  2. __threadfence()

  3. Atomically mark a flag

it is guaranteed that if the other block sees the flag, it will also see the data.
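The three steps above can be sketched roughly as follows. This is only an illustrative sketch, not code from the original answer: the kernel name sumBlocks, the counter blocksDone, and the buffers partial and total are hypothetical names chosen for the example. Each block stores a partial sum, calls __threadfence(), and then atomically increments a counter that plays the role of the flag. The block that observes the final counter value knows every other block has already published its data, so it can safely read all partial sums.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical "flag": counts how many blocks have published their result.
__device__ unsigned int blocksDone = 0;

// partial is volatile so the final reads are not served from a stale cache.
__global__ void sumBlocks(const int *in, int n, volatile int *partial, int *total)
{
    // Only thread 0 of each block works, to keep the sketch short.
    if (threadIdx.x != 0) return;

    int chunk = (n + gridDim.x - 1) / gridDim.x;
    int begin = blockIdx.x * chunk;
    int end = min(begin + chunk, n);
    int sum = 0;
    for (int i = begin; i < end; ++i) sum += in[i];

    // 1. Store your data.
    partial[blockIdx.x] = sum;

    // 2. Make the store above visible to the rest of the grid
    //    before the flag update below can be observed.
    __threadfence();

    // 3. Atomically mark the flag (here: take a ticket from the counter).
    unsigned int ticket = atomicAdd(&blocksDone, 1u);

    // The block that takes the last ticket sees every other block's data.
    if (ticket == gridDim.x - 1) {
        int t = 0;
        for (unsigned int b = 0; b < gridDim.x; ++b) t += partial[b];
        *total = t;
        blocksDone = 0;  // reset so the kernel can be launched again
    }
}

int main()
{
    const int n = 1000, blocks = 8;
    int host[n];
    for (int i = 0; i < n; ++i) host[i] = 1;

    int *in, *partial, *total;
    cudaMalloc(&in, n * sizeof(int));
    cudaMalloc(&partial, blocks * sizeof(int));
    cudaMalloc(&total, sizeof(int));
    cudaMemcpy(in, host, n * sizeof(int), cudaMemcpyHostToDevice);

    sumBlocks<<<blocks, 32>>>(in, n, partial, total);

    int result = 0;
    cudaMemcpy(&result, total, sizeof(int), cudaMemcpyDeviceToHost);
    printf("total = %d (expected %d)\n", result, n);

    cudaFree(in);
    cudaFree(partial);
    cudaFree(total);
    return 0;
}
```

The sketch uses a "last block finishes the job" design rather than having one block spin on a flag, because a spinning block can hang if the producing block is not scheduled at the same time. Without the __threadfence() call, the last block could observe the final counter value and still read a partial sum that has not yet become visible.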

Further reading: CUDA Programming Guide, Chapters B.2.4 and B.5
