How to detect SSE/SSE2/AVX/AVX2/AVX-512/AVX-128-FMA/KCVI availability at compile-time?

Published: 2018/4/20 16:31:30 Tags: gcc clang sse avx avx512


Question

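I'm trying to optimize some matrix computations, and I was wondering whether it is possible to detect at compile-time if SSE/SSE2/AVX/AVX2/AVX-512/AVX-128-FMA/KCVI^[1] is enabled by the compiler. Ideally for both GCC and Clang, but I can manage with only one of them.

I'm not sure it is possible, and perhaps I will end up using my own macro, but I'd prefer detecting it rather than asking the user to select it.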

^[1] "KCVI" stands for Knights Corner Vector Instruction optimizations. Libraries like FFTW detect/utilize these newer instruction optimizations.

Solution

Most compilers will automatically define:

  __SSE__
  __SSE2__
  __SSE3__
  __AVX__
  __AVX2__

etc., according to whatever command-line switches you are passing. You can easily check this with gcc (or gcc-compatible compilers such as clang), like this:

  $ gcc -msse3 -dM -E - < /dev/null | egrep "SSE|AVX" | sort
  #define __SSE__ 1
  #define __SSE2__ 1
  #define __SSE2_MATH__ 1
  #define __SSE3__ 1
  #define __SSE_MATH__ 1

or:

  $ gcc -mavx2 -dM -E - < /dev/null | egrep "SSE|AVX" | sort
  #define __AVX__ 1
  #define __AVX2__ 1
  #define __SSE__ 1
  #define __SSE2__ 1
  #define __SSE2_MATH__ 1
  #define __SSE3__ 1
  #define __SSE4_1__ 1
  #define __SSE4_2__ 1
  #define __SSE_MATH__ 1
  #define __SSSE3__ 1

or, to just check the pre-defined macros for a default build on your particular platform:

  $ gcc -dM -E - < /dev/null | egrep "SSE|AVX" | sort
  #define __SSE2_MATH__ 1
  #define __SSE2__ 1
  #define __SSE3__ 1
  #define __SSE_MATH__ 1
  #define __SSE__ 1
  #define __SSSE3__ 1
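The macros listed above can be tested directly with the preprocessor, which is what makes the detection happen at compile-time. The following is a minimal sketch of that idea; the function names `simd_level` and `simd_width_bytes` are hypothetical and not part of any library, and the ordering of the checks assumes that a wider extension implies the narrower ones, as the gcc output above suggests.

```c
/* Hypothetical helpers: report what the compiler was told to target.
 * The checks run from widest to narrowest, so the first match wins. */

/* Name of the widest x86 SIMD extension enabled for this translation unit. */
const char *simd_level(void)
{
#if defined(__AVX512F__)
    return "AVX-512F";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE3__)
    return "SSE3";
#elif defined(__SSE2__)
    return "SSE2";
#elif defined(__SSE__)
    return "SSE";
#else
    return "none";
#endif
}

/* Vector register width in bytes implied by the same macros (0 if none). */
int simd_width_bytes(void)
{
#if defined(__AVX512F__)
    return 64;
#elif defined(__AVX__)
    return 32;
#elif defined(__SSE__)
    return 16;
#else
    return 0;
#endif
}
```

Calling `simd_level()` from a program built with `-mavx2` should select the AVX2 branch, while the same source built without extra switches falls back to whatever the platform default defines. Note that the result reflects the compiler's target settings, not the CPU the binary eventually runs on.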

More recent Intel processors support AVX-512, which is not a monolithic instruction set. The two examples below show the support available from GCC (version 6.2).

Here is Knights Landing:

  $ gcc -march=knl -dM -E - < /dev/null | egrep "SSE|AVX" | sort
  #define __AVX__ 1
  #define __AVX2__ 1
  #define __AVX512CD__ 1
  #define __AVX512ER__ 1
  #define __AVX512F__ 1
  #define __AVX512PF__ 1
  #define __SSE__ 1
  #define __SSE2__ 1
  #define __SSE2_MATH__ 1
  #define __SSE3__ 1
  #define __SSE4_1__ 1
  #define __SSE4_2__ 1
  #define __SSE_MATH__ 1
  #define __SSSE3__ 1

Here is Skylake AVX-512:

  $ gcc -march=skylake-avx512 -dM -E - < /dev/null | egrep "SSE|AVX" | sort
  #define __AVX__ 1
  #define __AVX2__ 1
  #define __AVX512BW__ 1
  #define __AVX512CD__ 1
  #define __AVX512DQ__ 1
  #define __AVX512F__ 1
  #define __AVX512VL__ 1
  #define __SSE__ 1
  #define __SSE2__ 1
  #define __SSE2_MATH__ 1
  #define __SSE3__ 1
  #define __SSE4_1__ 1
  #define __SSE4_2__ 1
  #define __SSE_MATH__ 1
  #define __SSSE3__ 1
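Because the two targets define different AVX-512 macros, code usually needs to test each subset separately rather than a single "AVX-512" flag. One possible way to collect them is a bitmask, sketched below. The `HAS_*` constants and the `avx512_subsets` function are illustrative names invented for this example; only the `__AVX512*__` macros come from the compiler output above.

```c
/* Hypothetical bit flags, one per AVX-512 subset seen in the gcc output. */
enum {
    HAS_AVX512F  = 1 << 0,
    HAS_AVX512CD = 1 << 1,
    HAS_AVX512ER = 1 << 2,
    HAS_AVX512PF = 1 << 3,
    HAS_AVX512BW = 1 << 4,
    HAS_AVX512DQ = 1 << 5,
    HAS_AVX512VL = 1 << 6
};

/* Build a mask of the AVX-512 subsets enabled at compile-time.
 * Each #ifdef is resolved by the preprocessor, so the function
 * reduces to returning a constant. */
unsigned avx512_subsets(void)
{
    unsigned mask = 0;
#ifdef __AVX512F__
    mask |= HAS_AVX512F;
#endif
#ifdef __AVX512CD__
    mask |= HAS_AVX512CD;
#endif
#ifdef __AVX512ER__
    mask |= HAS_AVX512ER;
#endif
#ifdef __AVX512PF__
    mask |= HAS_AVX512PF;
#endif
#ifdef __AVX512BW__
    mask |= HAS_AVX512BW;
#endif
#ifdef __AVX512DQ__
    mask |= HAS_AVX512DQ;
#endif
#ifdef __AVX512VL__
    mask |= HAS_AVX512VL;
#endif
    return mask;
}
```

Going by the listings above, a build with `-march=knl` would set the F, CD, ER and PF bits, and a build with `-march=skylake-avx512` would set F, CD, BW, DQ and VL. A kernel that needs, say, both BW and VL can then check `(avx512_subsets() & (HAS_AVX512BW | HAS_AVX512VL))` against the same two flags.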
Intel has disclosed additional AVX-512 subsets (see ISA extensions). GCC (version 7) supports compiler flags and preprocessor symbols associated with the 4FMAPS, 4VNNIW, IFMA, VBMI and VPOPCNTDQ subsets of AVX-512:

  $ for i in 4fmaps 4vnniw ifma vbmi vpopcntdq ; do echo "==== $i ====" ; gcc -mavx512$i -dM -E - < /dev/null | egrep "AVX512" | sort ; done
  ==== 4fmaps ====
  #define __AVX5124FMAPS__ 1
  #define __AVX512F__ 1
  ==== 4vnniw ====
  #define __AVX5124VNNIW__ 1
  #define __AVX512F__ 1
  ==== ifma ====
  #define __AVX512F__ 1
  #define __AVX512IFMA__ 1
  ==== vbmi ====
  #define __AVX512BW__ 1
  #define __AVX512F__ 1
  #define __AVX512VBMI__ 1
  ==== vpopcntdq ====
  #define __AVX512F__ 1
  #define __AVX512VPOPCNTDQ__ 1
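A subset macro such as `__AVX512VPOPCNTDQ__` is typically used to guard a specialised code path, with a portable fallback for builds where it is not defined. The sketch below illustrates that pattern for counting set bits. The function name `popcount_words` is hypothetical; the intrinsics come from `<immintrin.h>`, and `__builtin_popcountll` is a GCC/Clang builtin, so this assumes one of those two compilers.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__AVX512F__) && defined(__AVX512VPOPCNTDQ__)
#include <immintrin.h>
#endif

/* Count the set bits in an array of 64-bit words.
 * The vector loop is compiled only when the VPOPCNTDQ subset is enabled;
 * otherwise the scalar loop handles every element. */
uint64_t popcount_words(const uint64_t *words, size_t n)
{
    uint64_t total = 0;
    size_t i = 0;

#if defined(__AVX512F__) && defined(__AVX512VPOPCNTDQ__)
    /* Process eight 64-bit words per iteration. */
    __m512i acc = _mm512_setzero_si512();
    for (; i + 8 <= n; i += 8) {
        __m512i v = _mm512_loadu_si512((const void *)(words + i));
        acc = _mm512_add_epi64(acc, _mm512_popcnt_epi64(v));
    }
    total += (uint64_t)_mm512_reduce_add_epi64(acc);
#endif

    /* Scalar tail, or the whole array when the vector path is absent. */
    for (; i < n; ++i)
        total += (uint64_t)__builtin_popcountll(words[i]);

    return total;
}
```

Built with `-mavx512vpopcntdq`, the vector loop is included; built without it, the same source compiles to the scalar loop only. The macro only says the compiler may emit those instructions, so a binary built with the flag still needs a CPU that supports them.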

