OpenMP declare SIMD for an inline function

Question

The current OpenMP standard says about the declare simd directive for C/C++:

The use of a declare simd construct on a function enables the creation of SIMD versions of the associated function that can be used to process multiple arguments from a single invocation in a SIMD loop concurrently.

More details are given in the chapter, but there seems to be no restriction there to the type of function the directive can be applied to.

So my question is, can this directive be applied safely to an inline function?

I'm asking this for two reasons:

  1. An inline function is a rather unusual function, since it is normally inlined directly at the place where it is called. So it is likely never compiled as a standalone function, and therefore its declare simd aspect is largely redundant with a possible simd directive at the level of the enclosing loop.
  2. I have code with such inline declare simd functions, and sometimes, for some nebulous reason, GCC complains about their multiple definition at link time (with names mangled with extra characters, suggesting that these are vectorised versions). But if I remove the declare simd directive, it compiles and links fine.

So far I hadn't thought too much about it, but now I'm puzzled. Is that a bug of mine (i.e. using declare simd for inline functions), or is it a problem in GCC, which generates binary vectorised versions of inline functions and fails to sort them out at link time?


EDIT:
There is a GCC compiler option that makes a difference. When inlining is enabled (with -O3 for example), the code compiles and links fine. But when compiling with -O0 or with -O3 -fno-inline, inlining is disabled and the linking fails with a "multiple definition of" error for the function decorated with the omp declare simd directive.


EDIT 2:
Thanks to @Zboson's questions regarding the compiler flags, I managed to create a reproducer. Here it is:

foobar.h :

#ifndef FOOBAR_H_
#define FOOBAR_H_

#include <cmath>

#pragma omp declare simd
inline double foo( double d ) {
    return sin( cos( exp( d ) ) );
}

double bar( double *v, int len );

#endif

foobar.cc :

#include "foobar.h"

double bar( double *v, int len ) {
    double sum = 0;
    for ( int i = 0; i < len; i++ ) {
        sum += foo( v[i] );
    }
    return sum;
}

simd.cc :

#include <iostream>
#include "foobar.h"

int main() {

    const int len = 100;
    double *v = new double[len];

    for ( int i = 0; i < len; i++ ) {
        v[i] = i;
    }

    double sum = 0;
    #pragma omp simd reduction( +: sum )
    for ( int i = 0; i < len; i++ ) {
        sum += foo( v[i] );
    }

    std::cout << sum << "  " << bar( v, len ) << std::endl;

    delete[] v;

    return 0;
}

Compilation:

> g++ -fopenmp -g simd.cc foobar.cc
/tmp/ccI4e7ip.o: In function `_ZGVbN2v__Z3food':
foobar.h:7: multiple definition of `_ZGVbN2v__Z3food'
/tmp/cc4U8Qyu.o:foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVbM2v__Z3food':
foobar.h:7: multiple definition of `_ZGVbM2v__Z3food'
/tmp/cc4U8Qyu.o:foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVcN4v__Z3food':
foobar.h:7: multiple definition of `_ZGVcN4v__Z3food'
/tmp/cc4U8Qyu.o:foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVcM4v__Z3food':
foobar.h:7: multiple definition of `_ZGVcM4v__Z3food'
foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVdN4v__Z3food':
foobar.h:7: multiple definition of `_ZGVdN4v__Z3food'
foobar.h:7: first defined here
/tmp/ccI4e7ip.o: In function `_ZGVdM4v__Z3food':
foobar.h:7: multiple definition of `_ZGVdM4v__Z3food'
foobar.h:7: first defined here
collect2: error: ld returned 1 exit status
> c++filt _ZGVdM4v__Z3food
_ZGVdM4v__Z3food
> c++filt _Z3food
foo(double)

GCC versions 4.9.2 and 5.1.0 both give the very same problem, while the Intel compiler version 15.0.3 compiles it just fine.

Final edit:
Hristo Iliev's comment and Z boson's question reassure me that my code is OpenMP compliant and that this is a bug in GCC. I'll run further tests with the most up-to-date version I can find, and report it if needed.

Recommended answer

An inline function is a rather unusual function, since it is normally inlined directly in the place it was called. So it is likely never compiled as a standalone function.

This is incorrect. A function, with or without inline, has external linkage unless it is declared static. The compiler has to produce a standalone version of the function (which won't be inlined) in case the function is called from another object file. If you don't want a standalone function, declare the function static. See section 8.3, under the heading "Inlined functions have a non-inlined copy", in Agner Fog's Optimizing software in C++ for more details.

Using static inline double foo does not give an error with your code.
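As a minimal sketch of that workaround, the header function from the reproducer can be given internal linkage. The helper name sum_foo is hypothetical and only serves to exercise foo; the body of foo is the one from foobar.h. Without -fopenmp the pragmas are simply ignored and the code behaves as ordinary scalar C++.

```cpp
#include <cmath>

// Internal linkage: every translation unit that includes this definition
// gets its own private copy of foo, and of any SIMD variants the compiler
// chooses to emit for it, so the linker never sees two external definitions.
#pragma omp declare simd
static inline double foo(double d) {
    return std::sin(std::cos(std::exp(d)));
}

// Hypothetical helper (not in the original post): sums foo over v[0..len).
double sum_foo(const double *v, int len) {
    double sum = 0.0;
    #pragma omp simd reduction(+ : sum)
    for (int i = 0; i < len; i++) {
        sum += foo(v[i]);
    }
    return sum;
}
```

The trade-off is that each object file carries its own copies of the function and its variants, which may increase code size slightly.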

Now let's look at the symbols. Without using static,

nm foobar.o | grep foo

gives

W _Z3food
T _ZGVbM2v__Z3food
T _ZGVbN2v__Z3food
T _ZGVcM4v__Z3food
T _ZGVcN4v__Z3food
T _ZGVdM4v__Z3food
T _ZGVdN4v__Z3food

and nm simd.o | grep foo gives the same thing.

The uppercase "W" and "T" mean the symbols are external. However, "W" is a weak symbol, which does not cause a link error when it is defined in more than one object file, whereas "T" is a strong symbol, which does. This is why the linker is complaining.
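The _ZGV names in these listings are not demangled by c++filt, but they appear to follow the naming scheme of the x86 vector function ABI that GCC uses: a _ZGV prefix, one letter for the instruction set, M or N for masked or unmasked, the number of lanes, one code per parameter, an underscore, and then the mangled scalar name. The following is a simplified sketch of a decoder under that assumption. The names SimdVariant and decode_simd_symbol are hypothetical, and the parser assumes the parameter codes contain no underscore.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Fields of a vector-variant symbol, assuming the layout
// _ZGV <isa> <mask> <lanes> <params> _ <scalar name>.
struct SimdVariant {
    bool valid = false;
    char isa = '\0';      // assumed x86 codes: b = SSE, c = AVX, d = AVX2, e = AVX-512
    bool masked = false;  // 'M' = masked variant, 'N' = unmasked variant
    int lanes = 0;        // elements processed per call
    std::string params;   // one code per parameter, e.g. "v" for a vector argument
    std::string scalar;   // mangled name of the scalar function, e.g. "_Z3food"
};

// Hypothetical helper: splits a symbol such as "_ZGVbN2v__Z3food" into fields.
SimdVariant decode_simd_symbol(const std::string &sym) {
    SimdVariant out;
    const std::string prefix = "_ZGV";
    if (sym.size() < prefix.size() + 3 || sym.compare(0, prefix.size(), prefix) != 0)
        return out;
    std::size_t pos = prefix.size();
    out.isa = sym[pos++];
    const char mask = sym[pos++];
    if (mask != 'M' && mask != 'N')
        return out;
    out.masked = (mask == 'M');
    // Read the decimal lane count.
    const std::size_t digits_begin = pos;
    while (pos < sym.size() && std::isdigit(static_cast<unsigned char>(sym[pos]))) {
        out.lanes = out.lanes * 10 + (sym[pos] - '0');
        ++pos;
    }
    if (pos == digits_begin)
        return out;
    // Parameter codes run up to the first '_' separator.
    const std::size_t sep = sym.find('_', pos);
    if (sep == std::string::npos)
        return out;
    out.params = sym.substr(pos, sep - pos);
    out.scalar = sym.substr(sep + 1);
    out.valid = true;
    return out;
}
```

Under these assumptions, _ZGVbN2v__Z3food reads as an unmasked SSE variant handling two doubles per call for the scalar function _Z3food, i.e. foo(double), which is consistent with the six variants covering three instruction sets in masked and unmasked form.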

What's the result with static inline? In this case nm foobar.o | grep foo gives

t _ZGVbM2v__ZL3food
t _ZGVbN2v__ZL3food
t _ZL3food

and nm simd.o | grep foo gives the same thing. But lowercase "t" means the symbols have local linkage and so there is no problem with the linker.

If we compile without OpenMP, the only foo symbol produced is _ZL3food. I don't know why GCC produces a weak symbol for the non-SIMD version of the function and strong symbols for the SIMD versions, so I can't completely answer your question, but I thought this information would be interesting nevertheless.
