Any particular function to initialize GPU other than the first cudaMalloc call?
Question
The first cudaMalloc call is slow (around 0.2 s) because of initialization work on the GPU. Is there any function that solely does the initialization, so that I can separate out that time? cudaSetDevice seems to reduce the time to 0.15 s, but it still does not eliminate all of the init overhead.
Answer
Calling
cudaFree(0);
is the canonical way to force lazy context establishment in the CUDA runtime. You can't reduce the overhead itself; it is a function of driver, runtime, and operating-system latencies. But the call above lets you control how and when those overheads occur during program execution.
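A minimal sketch of the idea, assuming you just want to measure where the init cost lands (the timing code and sizes here are illustrative, not from the original answer):

```cuda
#include <cuda_runtime.h>
#include <chrono>
#include <cstdio>

int main() {
    using clock = std::chrono::steady_clock;

    // Force lazy context establishment up front; this first CUDA call
    // pays the driver/runtime/OS initialization cost.
    auto t0 = clock::now();
    cudaFree(0);
    auto t1 = clock::now();

    // Subsequent allocations no longer carry the init overhead.
    void *p = nullptr;
    cudaMalloc(&p, 1 << 20);
    auto t2 = clock::now();
    cudaFree(p);

    printf("context init: %.3f ms\n",
           std::chrono::duration<double, std::milli>(t1 - t0).count());
    printf("cudaMalloc:   %.3f ms\n",
           std::chrono::duration<double, std::milli>(t2 - t1).count());
    return 0;
}
```

Placing the `cudaFree(0)` at program start-up moves the one-time overhead out of whatever code path you actually care about timing.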
EDIT in 2015: the heuristics of context initialisation in the runtime API have subtly changed over time, so that cudaSetDevice now establishes a context. The cudaFree() call therefore isn't explicitly required to initialise a context; you can use cudaSetDevice instead. Also note that some set-up time will still be incurred at the first kernel launch, whereas before this wasn't the case. For kernel timing, it is best to include a warm-up call before launching the kernel you will time, to remove this set-up latency. It appears that the various profiling tools have enough granularity built in to avoid this without any extra API calls or kernel calls.
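The warm-up pattern described above might look like the following sketch; the `scale` kernel, grid sizes, and buffer size are hypothetical, chosen only to illustrate the warm-up-then-time structure:

```cuda
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical kernel used only for illustration.
__global__ void scale(float *x, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) x[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;
    float *d = nullptr;
    cudaMalloc(&d, n * sizeof(float));

    // Warm-up launch: absorbs the one-time set-up cost of the first
    // kernel launch so it doesn't pollute the timed run.
    scale<<<(n + 255) / 256, 256>>>(d, n);
    cudaDeviceSynchronize();

    // Timed launch, using CUDA events for device-side timing.
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    cudaEventRecord(start);
    scale<<<(n + 255) / 256, 256>>>(d, n);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    printf("kernel time: %.3f ms\n", ms);

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    cudaFree(d);
    return 0;
}
```

Comparing the warm-up launch's wall time against the event-timed launch makes the set-up latency visible directly.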