What is the difference between cudaDeviceScheduleBlockingSync and cudaDeviceScheduleYield?


Question

As described here: How to reduce CUDA synchronization latency/delay

There are two approaches to waiting for a result from the device:

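  • "Polling": the CPU spins in a busy loop, which reduces latency when waiting for the result
  • "Blocking": the thread sleeps until an interrupt occurs, which improves overall performance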

For "Polling", I need to use cudaDeviceScheduleSpin.

But for "Blocking", do I need to use cudaDeviceScheduleYield or cudaDeviceScheduleBlockingSync?

What is the difference between cudaDeviceScheduleBlockingSync and cudaDeviceScheduleYield?

The documentation for cudaDeviceScheduleYield (http://developer.download.nvidia.com/compute/cuda/4_1/rel/toolkit/docs/online/group__CUDART__DEVICE_g18074e885b4d89f5a0fe1beab589e0c8.html) says: "Instruct CUDA to yield its thread when waiting for results from the device. This can increase latency when waiting for the device, but can increase the performance of CPU threads performing work in parallel with the device." In other words, it waits for the result without burning CPU in a spin loop, which sounds like "Blocking". cudaDeviceScheduleBlockingSync also waits for the result without burning CPU in a spin loop. So what is the difference?

Recommended answer

As I understand it, both cudaDeviceScheduleSpin and cudaDeviceScheduleYield use polling to synchronize. In pseudo-code, cudaDeviceScheduleSpin looks like this:

while (!IsCudaJobDone())
{
}

And cudaDeviceScheduleYield:

while (!IsCudaJobDone())
{
     Thread.Yield();
}

In other words, cudaDeviceScheduleYield tells the operating system that it may preempt the polling thread and run another thread that has work to do. This improves the performance of other CPU threads, but it can also increase latency: if the CUDA job finishes while some other thread is running, the completion is not noticed until the polling thread is scheduled again.
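To make the contrast with the third flag concrete, here is a minimal host-only sketch in C++ of the three waiting styles. It is an analogy, not how the CUDA driver is implemented: `FakeJob`, `wait_spin`, `wait_yield`, `wait_blocking`, `run_and_wait` and `check_all_strategies` are hypothetical names, and a plain CPU thread stands in for the device. The blocking variant follows the documented description of cudaDeviceScheduleBlockingSync, which blocks the CPU thread on a synchronization primitive rather than polling.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a GPU job: a completion flag plus a
// condition variable that the "device" thread signals when done.
struct FakeJob {
    std::atomic<bool> done{false};
    std::mutex m;
    std::condition_variable cv;

    // Called by the "device" side when the work is finished.
    void finish() {
        {
            std::lock_guard<std::mutex> lock(m);
            done.store(true, std::memory_order_release);
        }
        cv.notify_all();
    }
};

// Spin-style wait: busy loop; lowest latency, but one core stays fully busy.
inline void wait_spin(FakeJob& job) {
    while (!job.done.load(std::memory_order_acquire)) {
    }
}

// Yield-style wait: still polling, but offers the core to other
// runnable threads on every pass through the loop.
inline void wait_yield(FakeJob& job) {
    while (!job.done.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
}

// Blocking-style wait: no polling; the thread sleeps on a
// synchronization primitive until it is notified.
inline void wait_blocking(FakeJob& job) {
    std::unique_lock<std::mutex> lock(job.m);
    job.cv.wait(lock, [&job] { return job.done.load(); });
}

// Starts a fake job that takes `work` to complete, waits for it with
// the given strategy, and reports whether completion was observed.
template <typename Wait>
bool run_and_wait(Wait wait, std::chrono::milliseconds work) {
    FakeJob job;
    std::thread device([&job, work] {
        std::this_thread::sleep_for(work);
        job.finish();
    });
    wait(job);
    const bool observed = job.done.load();
    device.join();
    return observed;
}

// Usage example: all three strategies see the job complete; they
// differ only in how the waiting thread spends its time.
inline void check_all_strategies() {
    const std::chrono::milliseconds work(5);
    assert(run_and_wait(wait_spin, work));
    assert(run_and_wait(wait_yield, work));
    assert(run_and_wait(wait_blocking, work));
}
```

In a real CUDA program none of these loops is written by hand. The flag is passed to cudaSetDeviceFlags, and the runtime then applies the corresponding waiting strategy whenever the host thread waits on the device.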
