Inter-block synchronization in CUDA

Views: 407 Published: 2020/5/24 21:19:35 Tags: parallel-processing cuda nvidia gpu-programming


Problem description

I have spent a month searching for a solution to this problem: I cannot synchronize blocks in CUDA.

I've read a lot of posts about atomicAdd, cooperative groups, etc. I decided to use a global array in which each block writes to one element. After writing, one thread of the block waits (i.e., spins in a while loop) until all blocks have written to the global array.
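For reference, the scheme described above might look roughly like the following. This is a minimal sketch, not the asker's actual code: the names `spin_barrier_kernel`, `flags`, `passed`, and `run_spin_barrier` are hypothetical, the barrier is single-use, and the flag array is assumed to be zeroed by the host before the launch.

```cuda
#include <vector>
#include <cuda_runtime.h>

// Single-use barrier: one flag per block in global memory.
// `flags` must be zero-initialised by the host. All names are hypothetical.
__global__ void spin_barrier_kernel(volatile int *flags, int *passed)
{
    // ... phase 1 work of this block would go here ...
    __syncthreads();                     // whole block has finished phase 1

    if (threadIdx.x == 0) {
        __threadfence();                 // publish this block's earlier global writes
        flags[blockIdx.x] = 1;           // announce arrival
        for (unsigned b = 0; b < gridDim.x; ++b) {
            while (flags[b] == 0) { }    // spin until block b has arrived
        }
        __threadfence();
    }
    __syncthreads();                     // release the rest of the block

    // ... phase 2 work would go here ...
    if (threadIdx.x == 0) passed[blockIdx.x] = 1;
}

// Launches the kernel and returns how many blocks got past the barrier
// (-1 on a CUDA error).
// WARNING: this hangs if `blocks` exceeds the number of blocks the GPU can
// keep resident at the same time, because waiting blocks never free their SM.
int run_spin_barrier(int blocks, int threads)
{
    int *flags = nullptr, *passed = nullptr;
    if (cudaMalloc(&flags, blocks * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&passed, blocks * sizeof(int)) != cudaSuccess) {
        cudaFree(flags);
        return -1;
    }
    cudaMemset(flags, 0, blocks * sizeof(int));
    cudaMemset(passed, 0, blocks * sizeof(int));

    spin_barrier_kernel<<<blocks, threads>>>(flags, passed);
    cudaError_t err = cudaDeviceSynchronize();

    int count = -1;
    if (err == cudaSuccess) {
        std::vector<int> host(blocks, 0);
        cudaMemcpy(host.data(), passed, blocks * sizeof(int), cudaMemcpyDeviceToHost);
        count = 0;
        for (int b = 0; b < blocks; ++b) count += host[b];
    }
    cudaFree(flags);
    cudaFree(passed);
    return count;
}
```

Calling `run_spin_barrier(2, 32)` from host code should be safe on essentially any GPU, since two small blocks can be resident together. Larger grids are only safe while every block of the grid is resident simultaneously.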

When I use 3 blocks, my synchronization works well (because I have 3 SMs). But using 3 blocks gives me only 12% occupancy, so I need to use more blocks, and then they cannot be synchronized. The problem is: a block on an SM waits for the other blocks, so the SM cannot take on another block.
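A spin-wait barrier can only complete if every block of the grid is resident on the GPU at the same time. An SM can usually host more than one block, so that limit is not necessarily equal to the SM count. One way to estimate it is the runtime occupancy API. The sketch below assumes a stand-in kernel (`placeholder_kernel`) and a hypothetical helper name (`max_coresident_blocks`); the real kernel should be passed instead, since register and shared-memory usage change the result.

```cuda
#include <cuda_runtime.h>

// Stand-in for the real kernel (hypothetical); occupancy depends on the
// actual kernel's register and shared-memory usage.
__global__ void placeholder_kernel(int *data)
{
    if (data != nullptr) data[blockIdx.x * blockDim.x + threadIdx.x] = 1;
}

// Returns an estimate of the largest grid whose blocks can all be resident
// at once for the given block size, or -1 on a CUDA error.
int max_coresident_blocks(int threadsPerBlock)
{
    int device = 0, numSms = 0, blocksPerSm = 0;
    if (cudaGetDevice(&device) != cudaSuccess) return -1;
    if (cudaDeviceGetAttribute(&numSms, cudaDevAttrMultiProcessorCount, device)
            != cudaSuccess) return -1;
    // Resident blocks per SM for this kernel, block size, and 0 bytes of
    // dynamic shared memory.
    if (cudaOccupancyMaxActiveBlocksPerMultiprocessor(
            &blocksPerSm, placeholder_kernel, threadsPerBlock, 0) != cudaSuccess)
        return -1;
    return numSms * blocksPerSm;
}
```

A grid no larger than the returned value should be able to pass a grid-wide barrier. A larger grid risks the deadlock described above, because the waiting blocks never release their SM for the blocks that have not started yet.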

What can I do? How can I synchronize blocks when there are more blocks than SMs?
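One commonly used option for a grid-wide barrier is the cooperative groups `grid.sync()` call combined with a cooperative launch. The sketch below is an illustration under assumptions, not an answer taken from this page: `two_phase_kernel` and `run_two_phase` are hypothetical names, the device must report `cudaDevAttrCooperativeLaunch` (support can depend on the GPU, driver mode, and operating system), and some toolkit versions may require relocatable device code (`-rdc=true`). A cooperative launch does not lift the residency limit; it rejects a grid that is too large instead of hanging.

```cuda
#include <vector>
#include <cuda_runtime.h>
#include <cooperative_groups.h>
namespace cg = cooperative_groups;

// Phase 1: each block writes one partial value.
// Phase 2 (after the grid-wide barrier): one thread sums all partial values.
__global__ void two_phase_kernel(int *partial, int *result)
{
    cg::grid_group grid = cg::this_grid();

    if (threadIdx.x == 0) partial[blockIdx.x] = blockIdx.x + 1;

    grid.sync();   // only valid when the kernel was launched cooperatively

    if (grid.thread_rank() == 0) {
        int sum = 0;
        for (unsigned b = 0; b < gridDim.x; ++b) sum += partial[b];
        *result = sum;
    }
}

// Returns 1 + 2 + ... + blocks on success, or -1 if cooperative launch is
// unsupported, the grid is too large to be co-resident, or a CUDA call fails.
int run_two_phase(int blocks, int threads)
{
    int device = 0, supported = 0;
    if (cudaGetDevice(&device) != cudaSuccess) return -1;
    if (cudaDeviceGetAttribute(&supported, cudaDevAttrCooperativeLaunch, device)
            != cudaSuccess || !supported) return -1;

    int *partial = nullptr, *result = nullptr;
    if (cudaMalloc(&partial, blocks * sizeof(int)) != cudaSuccess) return -1;
    if (cudaMalloc(&result, sizeof(int)) != cudaSuccess) {
        cudaFree(partial);
        return -1;
    }

    void *args[] = { &partial, &result };
    cudaError_t err = cudaLaunchCooperativeKernel(
        (void *)two_phase_kernel, dim3(blocks), dim3(threads), args, 0, 0);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    int value = -1;
    if (err == cudaSuccess &&
        cudaMemcpy(&value, result, sizeof(int), cudaMemcpyDeviceToHost) != cudaSuccess)
        value = -1;

    cudaFree(partial);
    cudaFree(result);
    return value;
}
```

When the total work needs more blocks than can be resident, a typical pattern is to launch a grid that fits and have each thread loop over several work items, or to split the phases into separate kernel launches, since a kernel boundary also acts as a grid-wide barrier.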

CUDA/GPU specification: compute capability 6.1, 3 SMs, Windows 10, VS2015, GeForce MX150 graphics card. Please help me with this problem. I have tried a lot of code, but none of it works.
