CUDA Driver API vs. CUDA runtime


Question

When writing CUDA applications, you can either work at the driver level or at the runtime level as illustrated on this image (The libraries are CUFFT and CUBLAS for advanced math):


I assume the tradeoff between the two is increased performance for the low-level API, but at the cost of increased code complexity. What are the concrete differences, and are there any significant things you cannot do with the high-level API?


I am using CUDA.net for interop with C#, and it is built as a copy of the driver API. This encourages writing a lot of rather complex code in C#, while the C++ equivalent would be simpler using the runtime API. Is there anything to be gained by doing it this way? The one benefit I can see is that it is easier to integrate intelligent error handling with the rest of the C# code.

Answer

The CUDA runtime makes it possible to compile and link your CUDA kernels into executables. This means that you don't have to distribute cubin files with your application, or deal with loading them through the driver API. As you have noted, it is generally easier to use.
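As a minimal sketch of that convenience (the kernel, sizes, and scaling factor here are illustrative, not taken from the question): with the runtime API, the kernel lives in the same source file, nvcc compiles and links it into the executable, and the first runtime call initializes the context implicitly.

```cuda
// Minimal runtime-API sketch; kernel name and sizes are illustrative.
#include <cuda_runtime.h>

__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main(void) {
    const int n = 1024;
    float *d_data;
    cudaMalloc((void **)&d_data, n * sizeof(float));  // context created implicitly
    scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n); // execution configuration syntax
    cudaDeviceSynchronize();
    cudaFree(d_data);
    return 0;
}
```

No cubin file ships with the application; the kernel is embedded in the binary by nvcc.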

In contrast, the driver API is harder to program but provides more control over how CUDA is used. The programmer has to deal directly with initialization, module loading, etc.
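To give a feel for that extra work, here is a hedged sketch of the same kind of launch through the driver API; `scale.cubin` and the kernel name `scale` are hypothetical, standing in for a module precompiled with nvcc:

```cuda
// Driver-API sketch: every step the runtime does implicitly is explicit here.
// "scale.cubin" / "scale" are hypothetical module and kernel names.
#include <cuda.h>

int launch_scale(void) {
    cuInit(0);                                  // explicit initialization
    CUdevice dev;
    cuDeviceGet(&dev, 0);
    CUcontext ctx;
    cuCtxCreate(&ctx, 0, dev);                  // explicit context management
    CUmodule mod;
    cuModuleLoad(&mod, "scale.cubin");          // explicit module loading
    CUfunction fn;
    cuModuleGetFunction(&fn, mod, "scale");

    int n = 1024;
    float factor = 2.0f;
    CUdeviceptr d_data;
    cuMemAlloc(&d_data, n * sizeof(float));
    void *args[] = { &d_data, &factor, &n };    // kernel parameters, passed by address
    cuLaunchKernel(fn, (n + 255) / 256, 1, 1,   // grid dimensions
                   256, 1, 1,                   // block dimensions
                   0, 0, args, 0);              // shared mem, stream, params, extra
    cuCtxSynchronize();
    cuMemFree(d_data);
    cuCtxDestroy(ctx);
    return 0;
}
```

Note that this host code is plain C against `cuda.h`, which is why the driver API is language-independent: any language that can call a C library can drive it.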

Apparently more detailed device information can be queried through the driver API than through the runtime API. For instance, the free memory available on the device can be queried only through the driver API.
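For example, a sketch of that query via `cuMemGetInfo` (this assumes `cuInit` has been called and a context is current on the calling thread; later toolkits also added `cudaMemGetInfo` to the runtime API, but at the time of this answer the driver API was the only route):

```cuda
// Sketch: query free/total device memory through the driver API.
// Assumes cuInit() has been called and a context is current.
#include <cuda.h>
#include <stdio.h>

void print_free_memory(void) {
    size_t free_bytes = 0, total_bytes = 0;
    if (cuMemGetInfo(&free_bytes, &total_bytes) == CUDA_SUCCESS) {
        printf("free: %zu MiB / total: %zu MiB\n",
               free_bytes >> 20, total_bytes >> 20);
    }
}
```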

From the CUDA Programmer's Guide:

It is composed of two APIs:


  • A low-level API called the CUDA driver API,

  • A higher-level API called the CUDA runtime API that is
    implemented on top of the CUDA driver API.


These APIs are mutually exclusive: An application should use either one or the other.

The CUDA runtime eases device code management by providing implicit initialization, context management, and module management. The C host code generated by nvcc is based on the CUDA runtime (see Section 4.2.5), so applications that link to this code must use the CUDA runtime API.

In contrast, the CUDA driver API requires more code, is harder to program and debug, but offers a better level of control and is language-independent since it only deals with cubin objects (see Section 4.2.5). In particular, it is more difficult to configure and launch kernels using the CUDA driver API, since the execution configuration and kernel parameters must be specified with explicit function calls instead of the execution configuration syntax described in Section 4.2.3. Also, device emulation (see Section 4.5.2.9) does not work with the CUDA driver API.

There is no noticeable performance difference between the APIs. How your kernels use memory and how they are laid out on the GPU (in warps and blocks) will have a much more pronounced effect.
