Can I make my compiler use fast-math on a per-function basis?


Question

Suppose I have

template <bool UsesFastMath> void foo(float* data, size_t length);

and I want to compile one instantiation with -ffast-math (--use_fast_math for nvcc), and the other instantiation without it.

This can be achieved by instantiating each variant in a separate translation unit and compiling each with a different command line: one with the switch and one without.

My question is whether popular compilers (*) can be told to apply, or not apply, -ffast-math to individual functions, so that I can keep both instantiations in the same translation unit.

Notes:

  • If the answer is "no", bonus points for explaining why not.
  • This is not the same question as this one, which is about turning fast-math on and off at runtime. I'm much more modest...

(*) By popular compilers I mean any of gcc, clang, msvc, icc, and nvcc (for GPU kernel code) about which you have that information.

Recommended answer

As of CUDA 7.5 (the latest version I am familiar with, although CUDA 8.0 is currently shipping), nvcc does not support function attributes that allow programmers to apply specific compiler optimizations on a per-function basis.

Since optimization configurations set via command line switches apply to the entire compilation unit, one possible approach is to use as many compilation units as there are distinct optimization configurations, as already noted in the question; the source code can be shared by #include-ing it from a common file.
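The layout described above can be sketched as follows. This is a minimal illustration, not a definitive build setup: the file names foo_impl.hpp, foo_fast.cpp, and foo_precise.cpp are hypothetical, as is the loop body of foo. The three files are shown concatenated in one block so that it compiles on its own; in a real build each marked section would be a separate file, and only foo_fast.cpp would be compiled with -ffast-math.

```cpp
// ---- foo_impl.hpp (hypothetical shared header with the definition) ----
#include <cassert>
#include <cmath>
#include <cstddef>

template <bool UsesFastMath>
void foo(float* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    for (std::size_t i = 0; i < length; ++i) {
        // Identical source text for both variants; only the flags of the
        // translation unit that instantiates it differ.
        data[i] = std::sqrt(data[i]) / 3.0f;
    }
}

// ---- foo_fast.cpp (hypothetical): compiled WITH -ffast-math ----
// #include "foo_impl.hpp"
template void foo<true>(float*, std::size_t);   // explicit instantiation

// ---- foo_precise.cpp (hypothetical): compiled WITHOUT -ffast-math ----
// #include "foo_impl.hpp"
template void foo<false>(float*, std::size_t);  // explicit instantiation
```

Callers would include only the declaration of foo from the question, not foo_impl.hpp, so that they link against these two explicit instantiations instead of implicitly instantiating foo with their own translation unit's flags.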

With nvcc, the command line switch --use_fast_math basically controls three areas of functionality:

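  • Flush-to-zero mode is enabled (that is, denormal support is disabled)
  • Single-precision reciprocal, division, and square root are switched to approximate versions
  • Certain standard math functions are replaced by equivalent, lower-precision intrinsics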

You can apply some of these changes with per-operation granularity by using appropriate intrinsics, others by using PTX inline assembly.
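As a rough sketch of that per-operation approach, the template parameter from the question can select between the standard operations and the CUDA intrinsics __expf and __fdividef, and a flush-to-zero addition can be requested through the PTX instruction add.ftz.f32. The helper names exp_over, add_ftz, and apply_exp_over are hypothetical and not part of the original answer. The CUDA-specific parts are guarded by __CUDACC__ and __CUDA_ARCH__, so on the host the block compiles as plain C++ and both variants behave identically. It also assumes the file itself is compiled without --use_fast_math, since otherwise the precise branch would be approximated as well.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Hypothetical helper: computes exp(x) / y.
template <bool UsesFastMath>
HOST_DEVICE inline float exp_over(float x, float y) {
#ifdef __CUDA_ARCH__
    if (UsesFastMath) {
        // Device code only: approximate exponential and division intrinsics.
        return __fdividef(__expf(x), y);
    }
#endif
    return std::exp(x) / y;  // standard-precision path
}

// Hypothetical helper: addition that flushes denormals to zero on the device.
HOST_DEVICE inline float add_ftz(float a, float b) {
#ifdef __CUDA_ARCH__
    float r;
    asm("add.ftz.f32 %0, %1, %2;" : "=f"(r) : "f"(a), "f"(b));
    return r;
#else
    return a + b;  // host fallback, no flush-to-zero
#endif
}

// Hypothetical stand-in for the body of foo from the question.
template <bool UsesFastMath>
HOST_DEVICE void apply_exp_over(float* data, std::size_t length) {
    assert(data != nullptr || length == 0);
    for (std::size_t i = 0; i < length; ++i) {
        data[i] = exp_over<UsesFastMath>(data[i], 3.0f);
    }
}
```

With this arrangement both instantiations can sit in the same translation unit, at the cost of spelling out each approximate operation by hand instead of relying on the compiler switch.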
