OpenACC parallel kernels not getting generated

Problem description

I am developing code with PGC++ so that it can be accelerated on a GPU.

  • I am using OpenBabel, which depends on Eigen.
  • I tried using #pragma acc kernels
  • I tried using #pragma acc routine
  • My compile command is: "pgc++ -acc -ta=tesla -Minfo=all -I/home/pranav/new_installed/include/openbabel-2.0/ -I/home/pranav/new_installed/include/eigen3/ -L/home/pranav/new_installed/lib/openbabel/ main.cpp /home/pranav/new_installed/lib/libopenbabel.so"

I am getting the following error:

PGCC-S-0155-Procedures called in a compute region must have acc routine information: OpenBabel::OBMol::SetTorsion(OpenBabel::OBAtom *, OpenBabel::OBAtom *, OpenBabel::OBAtom *, OpenBabel::OBAtom *, double) (main.cpp: 66)
PGCC-S-0155-Accelerator region ignored; see -Minfo messages  (main.cpp)
bondRot::two(std::vector>, OpenBabel::OBMol, int, OpenBabel::OBMol):
     11, include "bondRot.h"
           0, Accelerator region ignored
          66, Accelerator restriction: call to 'OpenBabel::OBMol::SetTorsion(OpenBabel::OBAtom *, OpenBabel::OBAtom *, OpenBabel::OBAtom *, OpenBabel::OBAtom *, double)' with no acc routine information
PGCC/x86 Linux 15.10-0: compilation completed with severe errors

NOTE: line 66 is "mol.SetTorsion(a[0],a[1],a[2],a[3],i*(3.14159265358979323846/180));" in the code pasted below.

My code that shows this error is as follows:

#pragma acc routine
public:
    bool two(vector<OBAtom *> a)
    {
        std::ostringstream bestanglei, bestanglej;
        for (unsigned int i = 0; i <= 360; i = i + res)
        {
            for (unsigned int j = 0; j <= 360; j = j + res)
            {
                mol.SetTorsion(a[0], a[1], a[2], a[3], i * (3.14159265358979323846 / 180));

                //cout<<i<<"\n";
            }
        }
        return true;
    }

From an initial Google search, I gathered that this error occurs because of a "back dependency" on mol (the OBMol object). If anyone knows a solution, please help me out.

Recommended answer

In order to call a routine from within device code, there must be an available device version of that routine. In this case, the compiler can't find one for the "OpenBabel::OBMol::SetTorsion" routine. You'll need to add a "#pragma acc routine" directive to this library routine's prototype and definition, then compile the library with PGI and "-acc". Any routines that SetTorsion might call will need device versions as well.
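
As a minimal sketch of what that looks like, assuming you can edit and rebuild the library source: `torsion_radians` and `fill_torsions` below are hypothetical stand-ins, not OpenBabel code. The `routine` directive appears on both the prototype and the definition, and `seq` is an assumption here that each call runs sequentially inside a single device thread.

```cpp
// Hypothetical stand-in for a library routine; not part of OpenBabel.
// Prototype (normally in the library header): the directive asks the
// compiler to also build a device version of this function.
#pragma acc routine seq
double torsion_radians(double degrees);

// Definition (normally in the library source): repeat the directive here.
#pragma acc routine seq
double torsion_radians(double degrees)
{
    return degrees * (3.14159265358979323846 / 180.0);
}

// Caller: the compute region may call the routine because a device
// version of it exists. When OpenACC is not enabled, the pragmas are
// ignored and the loop simply runs on the host.
void fill_torsions(double *out, int n, double step_degrees)
{
    #pragma acc parallel loop copyout(out[0:n])
    for (int i = 0; i < n; ++i) {
        out[i] = torsion_radians(i * step_degrees);
    }
}
```

Calling `fill_torsions(buffer, 3, 90.0)` would fill `buffer` with the radian values for 0, 90 and 180 degrees. For a real library, every function reachable from the annotated routine would need the same treatment.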

Alternatively, you can try to inline these routines.

Note that you will have issues trying to write to I/O streams and files from device code. Only limited support for unformatted stdout is available, where the output from all threads is buffered, transferred back to the host, and then printed by the OS.
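
One way to stay clear of this restriction is to keep the compute region free of stream objects and do all formatting on the host afterwards. The sketch below illustrates that split; `compute_angles` and `format_angles` are hypothetical helper names, not part of the question's code.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: device part. Arithmetic only, no streams or files.
void compute_angles(double *radians, int count, double step_degrees)
{
    #pragma acc parallel loop copyout(radians[0:count])
    for (int i = 0; i < count; ++i) {
        radians[i] = i * step_degrees * (3.14159265358979323846 / 180.0);
    }
}

// Hypothetical helper: host part. By the time this runs, the results are
// back in host memory, so ordinary stream I/O can be used.
std::string format_angles(const double *radians, int count)
{
    std::ostringstream report;
    for (int i = 0; i < count; ++i) {
        report << i << ":" << radians[i] << "\n";
    }
    return report.str();
}
```

Under this arrangement, an object like the `std::ostringstream` in the question would live only in host code, outside any routine compiled for the device.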

You'll also have issues using std::vector. Besides not being thread safe, it is an aggregate data type with dynamic data members, which OpenACC does not yet support. There are ways to handle these structures if you're willing to manage the data in the structure itself, or to use CUDA Unified Memory (-ta=tesla:managed). If you're interested, I gave a talk on this subject at GTC2015 which you can review at: https://www.youtube.com/watch?v=rWLmZt_u5u4
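
A simple form of managing the data yourself is to take the vector's raw pointer and length on the host and hand only those to the compute region. This is a sketch under that assumption, with `scale_in_place` as a hypothetical helper; it is not necessarily the approach covered in the talk.

```cpp
#include <vector>

// Hypothetical helper: scale every element of a std::vector on the device.
void scale_in_place(std::vector<double> &values, double factor)
{
    // Take the raw pointer and length on the host; the device code never
    // touches the std::vector object itself.
    double *data = values.data();
    const int n = static_cast<int>(values.size());

    // copy: send data[0..n) to the device on entry, bring it back on exit.
    #pragma acc parallel loop copy(data[0:n])
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}
```

The vector must not be resized while the region is running, since that could move the underlying storage. With -ta=tesla:managed the explicit `copy` clause may become unnecessary, but that depends on how the vector's memory was allocated.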

Hope this helps, Mat
