OpenACC and object oriented C++

Question

I am trying to write object-oriented C++ code that is parallelized with OpenACC. I was able to find some Stack Overflow questions and GTC talks on OpenACC, but I could not find any real-world examples of object-oriented code.

In this question, an example of an OpenACCArray was shown that does some memory management in the background (code available at http://www.pgroup.com/lit/samples/gtc15_S5233.tar). However, I am wondering whether it is possible to create a class that manages the arrays at a higher level, e.g.:

struct Data
{

//    OpenACCArray<float> a;

    OpenACCArray<Vector3<float>> a3;

    Data(size_t len) {
#pragma acc enter data copyin(this)
//        a.resize(len);
        a3.resize(len);
    }
    ~Data() {
#pragma acc exit data delete(this)
    }
    void update_device() {
//        a.update_device();
        a3.update_device();
    }
    void update_host() {
//        a.update_host();
        a3.update_host();
    }
};

int main(int argc, char *argv[])
{
    const size_t len = 32*128;
    Data d(len);

    d.update_device();
#pragma acc kernels loop independent present(d)
    for (int i=0; i < len; ++i) {
        float val = (float)i/(float)len;

        d.a3[i].x = val;
        d.a3[i].y = i;
        d.a3[i].z = d.a3[i].x / d.a3[i].y;
    }
    d.update_host();
    for (int i=0; i < len/128; ++i) {
       cout << i << ": " << d.a3[i].x << "," << d.a3[i].y << "," << d.a3[i].z << endl;
    }
    cout << endl;
    return 0;
}

Interestingly, this program works, but as soon as I uncomment OpenACCArray<float> a;, i.e. add another member to the Data struct, I get a memory error: FATAL ERROR: variable in data clause is partially present on the device.

Since the OpenACCArray struct is a flat structure that handles the pointer indirections on its own, shouldn't it work to copy it as a member? Or does the member need to be a pointer to the struct, with the pointers hardwired with directives? In that case I fear I would have to use alias pointers, as suggested by Jeff Larkin in the above-mentioned question. I don't mind doing the work to get this running, but I cannot find any reference on how to do that. Using the compiler options keepgpu,keepptx helps a bit to understand what the compiler is doing, but I would prefer an alternative to reverse engineering generated PTX code.

Any pointers to helpful reference projects or documents are highly appreciated.

Recommended answer

In the OpenACCArray1.h header, remove the two "#pragma acc enter data create(this)" pragmas. What's happening is that the "Data" constructor is creating the "a" and "a3" objects on the device. Hence, when the second enter data region is encountered in the OpenACCArray constructor, the device copy of the this pointer is already there.

It works when there is only one data member, since "a3" and "Data" share the same address for the this pointer. Hence, when the second enter data pragma is encountered, the present check sees that it is already on the device and so does not create it again. When "a" is added, the size of "Data" is twice that of "a", so the present check sees that the this pointer is already there but has a different size than before. That's what the "partially present" error means: the data is there, but its size differs from the expected one.
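The address and size argument above can be checked on the host with plain C++, without any OpenACC. The sketch below uses a hypothetical Member struct as a stand-in for OpenACCArray (a pointer plus a length); the names Member, OneMember, TwoMembers and same_start are assumptions made for illustration and are not part of the sample code.

```cpp
#include <cstddef>

// Hypothetical stand-in for OpenACCArray<T>: a pointer plus a length.
struct Member {
    float*      ptr;
    std::size_t size;
};

// Parent with a single member, like Data with only "a3".
struct OneMember {
    Member a3;
};

// Parent with two members, like Data with "a" and "a3".
struct TwoMembers {
    Member a;
    Member a3;
};

// Returns true when the parent object and the given member start at the
// same address, i.e. when both would yield the same host "this" pointer.
template <typename Parent, typename First>
bool same_start(const Parent& parent, const First& member) {
    return static_cast<const void*>(&parent) ==
           static_cast<const void*>(&member);
}
```

With a single member, same_start(o, o.a3) holds and, on common ABIs, sizeof(OneMember) equals sizeof(Member), so a presence lookup keyed on host address and size would see the parent and the member as the same entry. With two members, the parent still starts at the address of "a" but is larger than "a", which matches the situation the answer describes as "partially present".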

Only the parent class/struct should create the this pointer on the device.
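As a rough sketch of that rule, the parent below is the only place that issues enter data for this, while the member array class only manages its own buffer. ChildArray and Parent are hypothetical names, and this is a simplified illustration, not the actual contents of OpenACCArray1.h. It has not been verified on a device; with a compiler that does not support OpenACC the pragmas are ignored and the code runs on the host only.

```cpp
#include <cstddef>

// Hypothetical, simplified array wrapper (an assumption for illustration).
// It never creates "this" on the device; the enclosing object does that.
template <typename T>
class ChildArray {
public:
    ChildArray() = default;
    ChildArray(const ChildArray&) = delete;
    ChildArray& operator=(const ChildArray&) = delete;
    ~ChildArray() { release(); }

    void resize(std::size_t n) {
        release();
        data_ = new T[n]();   // value-initialized host buffer
        size_ = n;
        // Assumes the enclosing object is already present on the device.
        #pragma acc update device(size_)
        #pragma acc enter data create(data_[0:size_])
    }

    // Frees the device and host buffers; safe to call more than once.
    void release() {
        if (data_ != nullptr) {
            #pragma acc exit data delete(data_[0:size_])
            delete[] data_;
            data_ = nullptr;
            size_ = 0;
        }
    }

    void update_device() {
        #pragma acc update device(data_[0:size_])
    }
    void update_host() {
        #pragma acc update self(data_[0:size_])
    }

    T& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }

private:
    T*          data_ = nullptr;
    std::size_t size_ = 0;
};

struct Parent {
    ChildArray<float> a;
    ChildArray<float> b;

    explicit Parent(std::size_t len) {
        // Only the parent creates "this" on the device; the copy covers
        // the storage of both members.
        #pragma acc enter data copyin(this)
        a.resize(len);
        b.resize(len);
    }
    ~Parent() {
        // Release the member buffers before removing the parent itself.
        a.release();
        b.release();
        #pragma acc exit data delete(this)
    }
};
```

Releasing the member buffers explicitly in the parent destructor is a design choice of this sketch: member destructors run after the parent destructor body, so without the explicit calls the buffers would be deleted from the device after the parent object had already been removed.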

Hope this helps, Mat
