The behavior of the __CUDA_ARCH__ macro
Question
In host code, it seems that the __CUDA_ARCH__ macro won't generate different code paths; instead, it generates code only for the exact code path of the current device.
However, if __CUDA_ARCH__ is used within device code, it generates a different code path for each device architecture specified in the compilation options (/arch).
Can anyone confirm that this is correct?
Recommended answer
When used in device code, __CUDA_ARCH__ is defined as a number that reflects the architecture the code is currently being compiled for.
It is not intended to be used in host code. From the nvcc manual:
This macro can be used in the implementation of GPU functions for determining the virtual architecture for which it is currently being compiled. The host code (the non-GPU code) must not depend on it.
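To make the "carries a number" point concrete: the value encodes the target compute capability as major*100 + minor*10, so a device pass for compute_70 sees 700. The following is only a minimal sketch; HOST_DEVICE, compiled_arch, arch_major and arch_minor are hypothetical names introduced for illustration, and the guard on __CUDACC__ is there so the same file also builds with a plain C++ compiler.

```cpp
// HOST_DEVICE is a hypothetical helper macro so this file also builds
// with a plain C++ compiler, where __host__/__device__ do not exist.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Returns the architecture value seen by the current compilation pass.
HOST_DEVICE inline int compiled_arch() {
#ifdef __CUDA_ARCH__
    return __CUDA_ARCH__;  // device pass: e.g. 700 when targeting compute_70
#else
    return 0;              // host pass: the macro is not defined
#endif
}

// Decode the major*100 + minor*10 encoding used by __CUDA_ARCH__.
HOST_DEVICE inline int arch_major(int arch) { return arch / 100; }
HOST_DEVICE inline int arch_minor(int arch) { return (arch % 100) / 10; }
```

Called from host code, compiled_arch() takes the #else branch because the macro is not defined there. Called from a kernel, it returns whichever architecture that particular device pass was compiled for, which is why a build with several target architectures produces several device code paths.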
The behavior of code that relies on __CUDA_ARCH__ in host code is therefore undefined (at least by CUDA). As @tera pointed out in the comments, since the macro itself is not defined during host compilation, it can be used to differentiate host and device paths, for example in a __host__ __device__ function definition:
#ifndef __CUDA_ARCH__
//host code here
#else
//device code here
#endif
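A slightly fuller version of the same pattern might look like the sketch below. It is an illustration under assumptions, not code from the answer: inv_sqrt and HOST_DEVICE are made-up names, and the choice of rsqrtf (a CUDA math function) for the device branch is just one plausible reason to want two bodies.

```cpp
#include <cmath>

// HOST_DEVICE is a hypothetical helper macro so this file also builds
// with a plain C++ compiler, where __host__/__device__ do not exist.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Reciprocal square root with a per-target body (inv_sqrt is a made-up name).
HOST_DEVICE inline float inv_sqrt(float x) {
#ifndef __CUDA_ARCH__
    // Host pass: only the standard library is used.
    return 1.0f / std::sqrt(x);
#else
    // Device pass: use the CUDA math function rsqrtf.
    return rsqrtf(x);
#endif
}
```

Note that only the function body is switched on the macro; the signature is identical in both passes. As far as I know, the CUDA programming guide warns against letting the signature of a __host__ __device__ function depend on whether __CUDA_ARCH__ is defined, so it seems safest to keep the conditional inside the body.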