The behavior of the __CUDA_ARCH__ macro


Question

In host code, the __CUDA_ARCH__ macro does not seem to generate different code paths; instead, it seems to generate code only for the exact code path of the current device.

However, if __CUDA_ARCH__ is used within device code, it generates a different code path for each device architecture specified in the compilation options (/arch).

Can anyone confirm that this is correct?

Recommended answer

When used in device code, __CUDA_ARCH__ is defined as a number that reflects the architecture the code is currently being compiled for.
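As a rough sketch of what that looks like in practice, the function below selects a path from the value of the macro. It assumes the usual convention that compute capability X.Y appears as the number XY0 (so 700 for compute_70). The threshold 700 and the name selected_path are made up for illustration. The qualifier is stubbed out so the snippet also builds with a plain C++ compiler, which is not something a real CUDA source file needs.

```cpp
// Assumption: stub the CUDA qualifier so this sketch also builds without nvcc.
#ifndef __CUDACC__
#define __device__
#endif

// Hypothetical device helper: picks a code path per target architecture.
// nvcc compiles device code once for each virtual architecture requested,
// and __CUDA_ARCH__ carries a different value in each of those passes.
__device__ inline int selected_path() {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 700
    return 2;  // path for compute capability 7.0 and newer
#elif defined(__CUDA_ARCH__)
    return 1;  // path for older device architectures
#else
    return 0;  // macro undefined: this is not a device compilation pass
#endif
}
```

Because the branch is resolved by the preprocessor, each compiled device image contains only the path that matches its own architecture.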

It is not intended to be used in host code. From the nvcc manual:

This macro can be used in the implementation of GPU functions for determining the virtual architecture for which it is currently being compiled. The host code (the non-GPU code) must not depend on it.

Usage of __CUDA_ARCH__ in host code is therefore undefined (at least by CUDA). As pointed out by @tera in the comments, since the macro is undefined in host code, it can be used to differentiate host and device paths, for example in a __host__ __device__ function definition:

#ifndef __CUDA_ARCH__
//host code here
#else
//device code here
#endif
