Why do we need cudaDeviceSynchronize(); in kernels with device-printf?

Question

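#include <cstdio>

// Each thread prints its own index and the value passed in.
__global__ void helloCUDA(float f)
{
    printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}

int main()
{
    // Launch one block of five threads.
    helloCUDA<<<1, 5>>>(1.2345f);
    // Wait for the kernel to finish before main returns.
    cudaDeviceSynchronize();
    return 0;
}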

Why is cudaDeviceSynchronize(); needed here? In many other places it is not required after a kernel call.

Answer

A kernel launch is asynchronous. This means it returns control to the CPU thread immediately after starting the work on the GPU, before the kernel has finished executing.
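The asynchronous return can be observed directly. The following is a minimal sketch, not part of the original answer: the `spin` kernel, the `statusRightAfterLaunch` helper, and the cycle count are made up for illustration, and the timing assumes a GPU on which the busy-wait lasts noticeably longer than the launch itself.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel: busy-waits for roughly `cycles` GPU clock ticks.
__global__ void spin(long long cycles)
{
    long long start = clock64();
    while (clock64() - start < cycles) { }
}

// Hypothetical helper: launches the kernel, records an event behind it in
// the default stream, and polls that event once without blocking.
cudaError_t statusRightAfterLaunch(cudaEvent_t done)
{
    spin<<<1, 1>>>(200000000LL);   // control returns to the host right away
    cudaEventRecord(done);         // completes only after the kernel completes
    return cudaEventQuery(done);   // non-blocking: cudaErrorNotReady if still running
}

int main()
{
    cudaEvent_t done;
    cudaEventCreate(&done);

    cudaError_t status = statusRightAfterLaunch(done);
    printf("right after launch: %s\n",
           status == cudaErrorNotReady ? "kernel still running" : "kernel finished");

    cudaDeviceSynchronize();       // block the host until the kernel is done
    status = cudaEventQuery(done);
    printf("after synchronize:  %s\n",
           status == cudaSuccess ? "kernel finished" : "kernel still running");

    cudaEventDestroy(done);
    return 0;
}
```

If the kernel really is still spinning when the host polls the event, the first line reports that it is still running, which is the window in which the program in the question would otherwise reach `return 0;`.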

So what is the next thing the CPU thread does here? The application exits.

At application exit, its ability to send output to standard output is terminated by the OS.

Thus any output the kernel generates after that point has nowhere to go, and you won't see it.

On the other hand, if you use cudaDeviceSynchronize(), the kernel is guaranteed to finish (and its output will find a standard output queue still waiting for it) before the application is allowed to exit.
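As a sketch of how this might look with error checking added, the version below wraps the launch and the wait in a helper. The `runHello` function is hypothetical and not from the original post; the only calls used are `cudaGetLastError`, `cudaDeviceSynchronize`, and `cudaGetErrorString` from the CUDA runtime. Device-side printf output is held in a buffer that the runtime flushes at synchronization points such as this one, and it is not flushed automatically at program exit.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

__global__ void helloCUDA(float f)
{
    printf("Hello thread %d, f=%f\n", threadIdx.x, f);
}

// Hypothetical helper: launch the kernel, then wait for it so the
// device-side printf buffer is flushed before the caller continues.
bool runHello(float f)
{
    helloCUDA<<<1, 5>>>(f);

    // Catches errors in the launch itself (e.g. an invalid configuration).
    cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "launch failed: %s\n", cudaGetErrorString(err));
        return false;
    }

    // Blocks the host until the kernel has finished; also reports
    // errors that occurred while the kernel was running.
    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return false;
    }
    return true;
}

int main()
{
    return runHello(1.2345f) ? 0 : 1;
}
```

Checking the return value of `cudaDeviceSynchronize()` is a design choice rather than a requirement for the printf output to appear, but it is usually the first place where a failure inside the kernel becomes visible to the host.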
