Detect OpenCL device vendor in kernel code
Question
I'm writing some platform-specific optimizations. I'm aware that I could parse the vendor string in the host code and pass it to the kernel using the -D option, but it would perhaps be more convenient to detect the vendor directly in the kernel, without host involvement (that way it is possible to optimize kernels even without access to the host source code, ...).
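For comparison, the host-side route mentioned above could look roughly like the sketch below. It assumes the vendor string has already been read with clGetDeviceInfo and CL_DEVICE_VENDOR. The helper name vendor_build_option, the matched substrings, and the INTEL define are hypothetical choices for illustration, not something a driver guarantees.

```c
#include <string.h>

/* Hypothetical helper: map a CL_DEVICE_VENDOR string to a -D build option.
 * The substrings below are assumptions about typical vendor strings. */
static const char *vendor_build_option(const char *vendor)
{
    if (strstr(vendor, "NVIDIA"))
        return "-DNVIDIA";
    if (strstr(vendor, "Advanced Micro Devices") || strstr(vendor, "AMD"))
        return "-DAMD";
    if (strstr(vendor, "Intel"))
        return "-DINTEL"; /* INTEL is an illustrative name only */
    return ""; /* unknown vendor: no extra define */
}
```

The returned string would then be passed as the options argument of clBuildProgram, so the kernel source only needs to test NVIDIA or AMD with #ifdef.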
So far, I have come up with the following:
#ifdef __NV_CL_C_VERSION
/**
* @def NVIDIA
* @brief defined when compiling on NVIDIA GPUs
*/
#define NVIDIA
#endif // __NV_CL_C_VERSION
#if defined(__WinterPark__) || defined(__BeaverCreek__) || defined(__Turks__) || \
defined(__Caicos__) || defined(__Tahiti__) || defined(__Pitcairn__) || \
defined(__Capeverde__) || defined(__Cayman__) || defined(__Barts__) || \
defined(__Cypress__) || defined(__Juniper__) || defined(__Redwood__) || \
defined(__Cedar__) || defined(__ATI_RV770__) || defined(__ATI_RV730__) || \
defined(__ATI_RV710__) || defined(__Loveland__) || defined(__GPU__) || \
defined(__Hawaii__)
/**
* @def AMD
* @brief defined when compiling on AMD GPUs
* @note This list was originally found at https://github.com/magnumripper/JohnTheRipper/wiki/Predefined-macros-in-OpenCL-(standard-and-proprietary) and copied shamelessly. It is most definitely incomplete and contains the troubling __GPU__.
* @note AMD also defines __CPU__ when compiling for CL_DEVICE_TYPE_CPU.
*/
#define AMD
#endif // ...
Any additions or corrections? Does anyone know what Intel defines?
Recommended answer
I have just tried this on an AMD Fury X with the 1912.5 driver. All three of the following tests print their message:
#ifdef cl_amd_device_attribute_query
#pragma message "here goes AMD"
#endif
#ifdef __GPU__
#pragma message "here goes AMD GPU"
#endif
#ifdef __Fiji__
#pragma message "here goes Fiji AMD"
#endif
However, note that cl_amd_device_attribute_query is not a good test for an AMD device: the AMD platform also exposes the Intel CPU as a device and reports the same extension for it. Bummer.
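One way around this might be to combine the extension macro with the device-type macros mentioned in the question (__GPU__ and __CPU__). The following is only a sketch under that assumption: the TARGET_NAME and TARGET_AMD_GPU names are made up for illustration, and the check has only been reasoned about against the macros reported in this thread, not verified across drivers.

```c
/* Sketch only: classify the compile target from the macros discussed above.
 * If none of them is defined, TARGET_NAME falls back to "generic". */
#if defined(__NV_CL_C_VERSION)
#define TARGET_NAME "nvidia"
#elif defined(cl_amd_device_attribute_query) && defined(__GPU__)
/* AMD platform and a GPU device: treat as an AMD GPU. */
#define TARGET_AMD_GPU 1
#define TARGET_NAME "amd-gpu"
#elif defined(cl_amd_device_attribute_query) && defined(__CPU__)
/* AMD platform, but the device is a CPU (possibly an Intel one). */
#define TARGET_NAME "amd-platform-cpu"
#else
#define TARGET_NAME "generic"
#endif
```

Kernel code can then guard GPU-specific paths with #ifdef TARGET_AMD_GPU instead of testing the extension macro alone.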
I was going through amdocl64.dll and noticed the following:
-cl-std=CL2.0
#define __clang__ 1
#define __clang_major__ 3
#define __clang_minor__ 6
#define __ENDIAN_LITTLE__ 1
#define __SPIR32 1
#define __SPIR32__ 1
#define __STDC__ 1
#define __STDC_HOSTED__ 1
#define __STDC_VERSION__ 199901L
#define __STDC_UTF_16__ 1
#define __STDC_UTF_32__ 1
#define __OPENCL_C_VERSION__ 200
#define __OPENCL_VERSION__ 200
-Wf,--force_disable_spir
-fno-lib-no-inline
-fno-sc-keep-calls
-fno-enable-dump
-cl-internal-kernel
-cl-std=CL
-cl-std=CL1.2
-just-kernel=
-DFP_FAST_FMAF=1
-DFP_FAST_FMA=1
-cl-denorms-are-zero
cl-kernel-arg-info
-fno-bin-llvmir
-fno-image-support
-mfast-fmaf
-mfast-fma kernel-arg-alignment
Note that neither __GPU__ nor __Fiji__ is found in this DLL. Otherwise it seems like a bunch of interesting options. Note that not all of them work; some of them likely need to be prefixed with a -.