How do you include standard CUDA libraries to link with NVRTC code?

Problem description

Specifically, my issue is that I have CUDA code that needs <curand_kernel.h> to run. This isn't included by default in NVRTC. Presumably then when creating the program context (i.e. the call to nvrtcCreateProgram), I have to send in the name of the file (curand_kernel.h) and also the source code of curand_kernel.h? I feel like I shouldn't have to do that.

It's hard to tell; I haven't managed to find an example from NVIDIA of someone needing standard CUDA files like this as a source, so I really don't understand what the syntax is. Some issues: curand_kernel.h also has includes of its own. Do I have to do the same for each of these? I'm not even sure the NVRTC compiler will run correctly on curand_kernel.h, because there are some language features it doesn't support, aren't there?

Next: if you've sent in the source code of a header file to nvrtcCreateProgram, do I still have to #include it in the code to be executed / will it cause an error if I do so?

A link to example code that does this or something like it would be appreciated much more than a straightforward answer; I really haven't managed to find any.

Recommended answer

You have to send the "filename" and the source of each header separately.

When the preprocessor does its thing, it'll use any #include filenames as a key to find the source for the header, based on the collection that you provide.
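
To make that lookup concrete, here is a toy C++ model of it. It is not NVRTC's implementation, only an illustration of the idea under simplifying assumptions: HeaderTable and expand_includes are hypothetical names, and the sketch ignores macros, include guards, and include cycles.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the compiler's internal header collection:
// include name -> header source text.
using HeaderTable = std::map<std::string, std::string>;

// Replace every #include "name" or #include <name> line with the source stored
// under that name in `headers`. Recurses, so headers may include other headers.
// Throws if a name was not provided or the directive is malformed.
std::string expand_includes(const std::string& source, const HeaderTable& headers) {
    std::istringstream in(source);
    std::string line;
    std::string out;
    while (std::getline(in, line)) {
        const std::size_t first = line.find_first_not_of(" \t");
        const bool is_include =
            first != std::string::npos && line.compare(first, 8, "#include") == 0;
        if (!is_include) {
            out += line;
            out += '\n';
            continue;
        }
        const std::size_t open = line.find_first_of("\"<", first + 8);
        const std::size_t close =
            (open == std::string::npos) ? open : line.find_first_of("\">", open + 1);
        if (close == std::string::npos) {
            throw std::runtime_error("malformed #include: " + line);
        }
        // The include name is the lookup key, exactly as written in the source.
        const std::string name = line.substr(open + 1, close - open - 1);
        const auto it = headers.find(name);
        if (it == headers.end()) {
            throw std::runtime_error("header not provided: " + name);
        }
        out += expand_includes(it->second, headers);
    }
    return out;
}
```

Calling expand_includes on a kernel string whose first line is #include "foo.cuh", with a table that maps "foo.cuh" to its source, returns the kernel with the header text spliced in. A name that is missing from the table throws, which mirrors the "cannot open source file" style of failure you get when a header was never supplied.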

I suspect that, in this case, the compiler (driver) doesn't have file system access, so you have to give it the source in much the same way that you would for shader includes in OpenGL.

So:

  • Include your header's name (along with its source) when calling nvrtcCreateProgram. The compiler will, internally, generate the equivalent of a std::map<string,string> containing the source of each header indexed by the given name.

  • In your kernel source, use #include "foo.cuh" as usual.

  • The compiler will use foo.cuh as an index or key into its internal map (created when you called nvrtcCreateProgram), and will retrieve the header source from that collection.

  • Compilation proceeds as normal.
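
The steps above can be sketched in C++ roughly as follows. This is a minimal sketch, not a tested recipe: Header, HeaderArrays, make_header_arrays, compile_to_ptx, and the program name "kernel.cu" are hypothetical, while the nvrtc* calls are the documented NVRTC entry points. The NVRTC part is compiled only when nvrtc.h is found on the include path, so the array-building helper can be tried without a CUDA toolkit.

```cpp
#include <cstddef>
#include <string>
#include <vector>

#if defined(__has_include)
#  if __has_include(<nvrtc.h>)
#    include <nvrtc.h>
#    define SKETCH_HAVE_NVRTC 1
#  endif
#endif

// One in-memory header: the name used in #include, plus its full source text.
struct Header {
    std::string include_name;  // e.g. "foo.cuh", exactly as written in the kernel
    std::string source;
};

// The two parallel C-string arrays that nvrtcCreateProgram takes.
// The pointers borrow from the Header objects, which must outlive this struct.
struct HeaderArrays {
    std::vector<const char*> sources;
    std::vector<const char*> include_names;
};

inline HeaderArrays make_header_arrays(const std::vector<Header>& headers) {
    HeaderArrays arrays;
    for (const Header& h : headers) {
        arrays.sources.push_back(h.source.c_str());
        arrays.include_names.push_back(h.include_name.c_str());
    }
    return arrays;
}

#ifdef SKETCH_HAVE_NVRTC
// Compile kernel_source to PTX with the given in-memory headers.
// Returns true on success; `log` always receives the compiler log.
// Both strings keep the trailing NUL that NVRTC counts in its sizes.
inline bool compile_to_ptx(const std::string& kernel_source,
                           const std::vector<Header>& headers,
                           std::vector<const char*> options,
                           std::string& ptx, std::string& log) {
    HeaderArrays arrays = make_header_arrays(headers);
    nvrtcProgram prog;
    if (nvrtcCreateProgram(&prog, kernel_source.c_str(), "kernel.cu",
                           static_cast<int>(headers.size()),
                           arrays.sources.data(),
                           arrays.include_names.data()) != NVRTC_SUCCESS) {
        return false;
    }
    const nvrtcResult compiled = nvrtcCompileProgram(
        prog, static_cast<int>(options.size()), options.data());

    std::size_t log_size = 0;
    nvrtcGetProgramLogSize(prog, &log_size);
    log.resize(log_size);
    if (log_size > 0) nvrtcGetProgramLog(prog, &log[0]);

    bool ok = (compiled == NVRTC_SUCCESS);
    if (ok) {
        std::size_t ptx_size = 0;
        ok = nvrtcGetPTXSize(prog, &ptx_size) == NVRTC_SUCCESS;
        ptx.resize(ptx_size);
        if (ok && ptx_size > 0) ok = nvrtcGetPTX(prog, &ptx[0]) == NVRTC_SUCCESS;
    }
    nvrtcDestroyProgram(&prog);
    return ok;
}
#endif
```

The kernel source still contains its #include "foo.cuh" line; the header is only found because the same string appears in the include-name array. NVRTC also documents an --include-path (-I) compile option that can be passed through the options array. Whether a toolkit header such as curand_kernel.h compiles cleanly that way depends on the CUDA version, so treat it as something to try rather than a guarantee.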

One of the reasons that nvrtc provides only a "subset" of features is that the compiler plays in a somewhat sandboxed environment, without necessarily having all of the supporting tools and utilities lying around that you have with offline compilation. So, you have to manually handle a lot of the stuff that the normal nvcc + (gcc | MSVC | clang) combination provides.

A possible, but non-ideal, solution would be to preprocess the file that you need in your IDE, save the result, and then #include that. However, I bet there is a better way to do that. If you just want curand, consider diving into the library and extracting the part you need (blech) or using another GPU-friendly rand implementation. On older CUDA versions, I just generated a big array of random floats on the host, uploaded it to the GPU, and sampled it in the kernels.
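
The host-generated table fallback might look something like the sketch below. Everything here is an assumption made for illustration: make_random_table, kSampleKernelSource, and the scale_by_random kernel are hypothetical names, and the upload step (for example with cudaMemcpy) is left out.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Fill a table with `count` pseudo-random floats in the range 0 to 1,
// generated on the host. The same seed always yields the same table.
std::vector<float> make_random_table(std::size_t count, unsigned int seed) {
    std::mt19937 engine(seed);
    std::uniform_real_distribution<float> dist(0.0f, 1.0f);
    std::vector<float> table(count);
    for (float& value : table) {
        value = dist(engine);
    }
    return table;
}

// Hypothetical kernel source, written as a string so it can be handed to
// nvrtcCreateProgram. It needs no headers: each thread reads one entry of the
// uploaded table, wrapping around when n exceeds table_size.
const char* const kSampleKernelSource = R"(
extern "C" __global__ void scale_by_random(const float* table,
                                           unsigned int table_size,
                                           float* data, unsigned int n) {
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        data[i] *= table[i % table_size];
    }
}
)";
```

As written, a given thread index reads the same table entry on every launch, so a real kernel would probably take an extra offset argument that changes between launches. The quality of the randomness is also limited by the table size, which is the trade-off for avoiding curand_kernel.h entirely.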

This related link may be helpful.
