Struct Alignment with PyOpenCL


Question

Update: the int4 in my kernel was wrong.

I am using pyopencl but am unable to get struct alignment to work correctly. In the code below, which calls the kernel twice, the b value is returned correctly (as 1), but the c value has some "random" value.

In other words: I am trying to read two members of a struct. I can read the first but not the second. Why?

The same issue occurs whether I use numpy structured arrays or pack with struct. The __attribute__ settings in the kernel (including the commented-out variants) don't help either.

I suspect I am doing something stupid elsewhere in the code, but can't see it. Any help appreciated.

import struct as s
import pyopencl as cl
import numpy as n

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

for use_struct in (True, False):

    if use_struct:
        a = s.pack('=ii',1,2)
        print(a, len(a))
        a_dev = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, len(a))
    else:
#       a = n.array([(1,2)], dtype=n.dtype('2i4', align=True))
        a = n.array([(1,2)], dtype=n.dtype('2i4'))
        print(a, a.itemsize, a.nbytes)
        a_dev = cl.Buffer(ctx, cl.mem_flags.WRITE_ONLY, a.nbytes)

    b = n.array([0], dtype='i4')
    print(b, b.itemsize, b.nbytes)
    b_dev = cl.Buffer(ctx, cl.mem_flags.READ_ONLY, b.nbytes)

    c = n.array([0], dtype='i4')
    print(c, c.itemsize, c.nbytes)
    c_dev = cl.Buffer(ctx, cl.mem_flags.READ_ONLY, c.nbytes)

    prg = cl.Program(ctx, """
        typedef struct s {
            int4 f0;
            int4 f1 __attribute__ ((packed));
//            int4 f1 __attribute__ ((aligned (4)));
//            int4 f1;
        } s;
        __kernel void test(__global const s *a, __global int4 *b, __global int4 *c) {
            *b = a->f0;
            *c = a->f1;
        }
        """).build()

    cl.enqueue_copy(queue, a_dev, a)
    event = prg.test(queue, (1,), None, a_dev, b_dev, c_dev)
    event.wait()
    cl.enqueue_copy(queue, b, b_dev)
    print(b)
    cl.enqueue_copy(queue, c, c_dev)
    print(c)

The output (I had to reformat while cut+pasting, so may have messed up line breaks slightly; I've also added comments indicating what the various print values are):

# first using struct
/home/andrew/projects/personal/kultrung/env/bin/python3.2 /home/andrew/projects/personal/kultrung/src/kultrung/test6.py
b'\x01\x00\x00\x00\x02\x00\x00\x00' 8 # the struct packed values
[0] 4 4                               # output buffer 1
[0] 4 4                               # output buffer 2
/home/andrew/projects/personal/kultrung/env/lib/python3.2/site-packages/pyopencl/cache.py:343: UserWarning: Build succeeded, but resulted in non-empty logs: Build on <pyopencl.Device 'Intel(R) Core(TM)2 CPU         T5600  @ 1.83GHz' at 0x1385a20> succeeded, but said:

Build started Kernel <test> was successfully vectorized Done.   warn("Build succeeded, but resulted in non-empty logs:\n"+message)
[1]         # the first value (correct)
[240]       # the second value (wrong)

# next using numpy
[[1 2]] 4 8 # the numpy struct
[0] 4 4     # output buffer
[0] 4 4     # output buffer
/home/andrew/projects/personal/kultrung/env/lib/python3.2/site-packages/pyopencl/__init__.py:174: UserWarning: Build succeeded, but resulted in non-empty logs: Build on <pyopencl.Device 'Intel(R) Core(TM)2 CPU         T5600  @ 1.83GHz' at 0x1385a20> succeeded, but said:

Build started Kernel <test> was successfully vectorized Done.   warn("Build succeeded, but resulted in non-empty logs:\n"+message)
[1]        # first value (ok)
[67447488] # second value (wrong)

Process finished with exit code 0

Recommended answer

OK, I don't know where I got int4 from. I think it must be an Intel extension. Switching to AMD with int as the kernel type works as expected. I'll post more at http://acooke.org/cute/Somesimple0.html once I have cleaned things up.
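
For context on the symptom, here is a minimal host-side sketch of the byte layouts involved. It assumes that int4 in OpenCL C is a vector of four 32-bit ints (16 bytes), so the kernel's struct s spans 32 bytes with f1 starting at byte 16, while the question uploads only the 8 bytes produced by s.pack('=ii', 1, 2). The names INT4_SIZE, kernel_struct_size and f1_offset are illustrative and not part of PyOpenCL. The sketch only does arithmetic with the standard struct module and does not run a kernel.

```python
import struct

# Assumption: int4 is four 32-bit ints, i.e. 16 bytes per struct member.
INT_SIZE = struct.calcsize("=i")      # size of one 32-bit int
INT4_SIZE = 4 * INT_SIZE              # assumed size of one int4

# What the question uploads to the device: two packed 32-bit ints.
host_bytes = struct.pack("=ii", 1, 2)

# What the kernel's struct { int4 f0; int4 f1; } would span under the assumption.
kernel_struct_size = 2 * INT4_SIZE
f1_offset = INT4_SIZE                 # f1 begins right after f0

print("host bytes uploaded:   ", len(host_bytes))
print("kernel struct size:    ", kernel_struct_size)
print("offset of f1 in kernel:", f1_offset)
print("f1 lies past the uploaded data:", f1_offset >= len(host_bytes))

# With plain int members (struct { int f0; int f1; }) the layouts agree:
# f0 sits at byte 0 and f1 at byte 4 of the same 8 uploaded bytes.
f0 = struct.unpack_from("=i", host_bytes, 0)[0]
f1 = struct.unpack_from("=i", host_bytes, INT_SIZE)[0]
print("int layout reads:", f0, f1)
```

Under these assumptions, the first 4 bytes of f0 coincide with the packed value 1, which would explain why b looks right, while f1 lies past the end of the uploaded data, which is consistent with the arbitrary values seen for c. With plain int members, the 8-byte host layout and the kernel struct agree.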
