Generate code for multiple SIMD architectures


Question

I have written a library, where I use CMake for verifying the presence of headers for MMX, SSE, SSE2, SSE4, AVX, AVX2, and AVX-512. In addition to this, I check for the presence of the instructions and if present, I add the necessary compiler flags, -msse2 -mavx -mfma etc.

This is all very good, but I would like to deploy a single binary, which works across a range of generations of processors.

Question: Is it possible to tell the compiler (GCC) that whenever it optimizes a function using SIMD, it must generate code for a list of architectures, and of course introduce the high-level branches that select between them?

I am thinking of something similar to how the compiler generates several code paths for a function whose input pointers may be either 4- or 8-byte aligned. To avoid that, I use the __builtin_assume_aligned built-in.

What is best practice? Multiple binaries? Naming?

Recommended answer

As long as you don't care about portability, yes.

Recent versions of GCC make this easier than any other compiler I'm aware of by using the target_clones function attribute. Just add the attribute, with a list of targets you want to create versions for, and GCC will automatically create the different variants, as well as a dispatch function to choose a version automatically at runtime.
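As a rough illustration of the target_clones approach, here is a minimal sketch. The function name add_arrays, the MULTIVERSION macro, and the particular target list are made up for this example, and the guard assumes GCC on x86 Linux, where the ifunc-based dispatch is available. Elsewhere the macro expands to nothing and you get a single plain version.

```c
#include <stddef.h>

/* Assumption: GCC on x86 Linux. The macro name and the list of targets
 * are illustrative only; "default" must always be part of the list. */
#if defined(__GNUC__) && !defined(__clang__) && defined(__linux__) && \
    (defined(__x86_64__) || defined(__i386__))
#define MULTIVERSION __attribute__((target_clones("avx2", "sse4.2", "default")))
#else
#define MULTIVERSION /* no multiversioning: one portable version */
#endif

/* GCC emits one clone of this function per listed target, plus a resolver
 * that selects a clone at load time based on the CPU it is running on. */
MULTIVERSION
void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Callers just call add_arrays as usual. Note that the clones only differ in practice if the loop is auto-vectorized, so you would typically build with -O3 (or -O2 -ftree-vectorize).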

If you want a bit more portability you can use the target attribute, which clang and icc also support, but you'll have to write the dispatch function yourself (which isn't difficult), and emit the function multiple times (generally using a macro, or repeatedly including a header).
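A hand-written version of the same idea might look roughly like the sketch below. All names (scale, DEFINE_SCALE, pick_scale, and so on) are hypothetical, and the dispatch relies on __builtin_cpu_supports, which I know GCC and clang provide on x86; for other compilers you would need a different CPU-detection mechanism.

```c
#include <stddef.h>

/* Assumption: the target attribute and __builtin_cpu_supports are only
 * used on x86 with a GCC-compatible compiler; elsewhere only the generic
 * version is built. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_TARGETS 1
#else
#define HAVE_X86_TARGETS 0
#endif

/* Emit the same function body once per ISA by expanding a macro. */
#define DEFINE_SCALE(suffix, attr)                              \
    static attr void scale_##suffix(float *x, size_t n, float k) \
    {                                                           \
        for (size_t i = 0; i < n; ++i)                          \
            x[i] *= k;                                          \
    }

#define NO_ATTR /* expands to nothing */
DEFINE_SCALE(generic, NO_ATTR)
#if HAVE_X86_TARGETS
DEFINE_SCALE(sse2, __attribute__((target("sse2"))))
DEFINE_SCALE(avx2, __attribute__((target("avx2"))))
#endif

typedef void (*scale_fn)(float *, size_t, float);

/* Pick the best variant the running CPU supports. */
static scale_fn pick_scale(void)
{
#if HAVE_X86_TARGETS
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx2"))
        return scale_avx2;
    if (__builtin_cpu_supports("sse2"))
        return scale_sse2;
#endif
    return scale_generic;
}

/* Public entry point: resolve once, then call through the pointer.
 * The lazy initialization is not synchronized; every thread would
 * compute the same pointer, but use proper once-initialization if
 * that matters for your code. */
void scale(float *x, size_t n, float k)
{
    static scale_fn impl = NULL;
    if (!impl)
        impl = pick_scale();
    impl(x, n, k);
}
```

The macro is what "emitting the function multiple times" amounts to here. Putting the body in a header and including it repeatedly with a different suffix and attribute defined each time achieves the same thing and is easier to debug for larger functions.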

AFAIK, if you want your code to work with MSVC you'll need multiple compiler invocations with different options.
