Do the threads in a CUDA warp execute in parallel on a multiprocessor?


Question

A warp is 32 threads. Do the 32 threads execute in parallel on a multiprocessor? If the 32 threads do not execute in parallel, then there should be no race condition within the warp. This doubt came up after going through some examples.

Answer

In the CUDA programming model, all the threads within a warp run in parallel. The actual execution in hardware may not be fully parallel, however, because the number of cores within an SM (Streaming Multiprocessor) can be less than 32. For example, the GT200 architecture has 8 cores per SM, so the 32 threads of a warp are processed 8 at a time and the warp needs 4 clock cycles (32 / 8) to execute an instruction.

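If multiple threads write to the same location (in either shared memory or global memory) and you want to avoid a race, you have to use atomic operations or locks, because the CUDA programming model does not guarantee which thread's write will take effect. This holds whether or not the hardware runs all 32 threads at the same instant, so serializing a warp over several clock cycles does not remove the race condition.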
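To make the point about atomic operations concrete, here is a minimal sketch in which every thread of a single warp increments the same counter in global memory, once with a plain read-modify-write and once with `atomicAdd`. The kernel names and the `run_counter` helper are hypothetical and exist only for this illustration; they are not part of the original answer.

```cuda
#include <cuda_runtime.h>

// Every thread performs a plain read-modify-write on the same address.
// This is a data race: the final value is unspecified, and increments
// are typically lost because several threads read the same old value.
__global__ void increment_racy(int *counter) {
    *counter = *counter + 1;
}

// atomicAdd makes each read-modify-write indivisible, so no update is lost.
__global__ void increment_atomic(int *counter) {
    atomicAdd(counter, 1);
}

// Hypothetical helper: launches one block of `threads` threads on a
// zero-initialized counter and returns its final value, or -1 if a
// CUDA runtime call fails.
int run_counter(bool use_atomic, int threads) {
    int *d_counter = nullptr;
    int h_counter = -1;

    if (cudaMalloc(&d_counter, sizeof(int)) != cudaSuccess) {
        return -1;
    }
    if (cudaMemset(d_counter, 0, sizeof(int)) != cudaSuccess) {
        cudaFree(d_counter);
        return -1;
    }

    if (use_atomic) {
        increment_atomic<<<1, threads>>>(d_counter);
    } else {
        increment_racy<<<1, threads>>>(d_counter);
    }

    // A blocking cudaMemcpy on the default stream waits for the kernel
    // to finish before copying the result back to the host.
    cudaError_t err = cudaMemcpy(&h_counter, d_counter, sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(d_counter);
    return err == cudaSuccess ? h_counter : -1;
}
```

Calling `run_counter(true, 32)` from a host `main` should return 32, since each of the 32 atomic increments is applied. `run_counter(false, 32)` has no guaranteed result, regardless of how many cores the SM has, which is why the answer recommends atomics or locks.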
