Cython: understanding a typed memoryview with an indirect_contiguous memory layout
Question
I want to understand more about Cython's awesome typed memoryviews and the memory layout indirect_contiguous.
According to the documentation, indirect_contiguous is used when "the list of pointers is contiguous".
There's also an example usage:
# contiguous list of pointers to contiguous lists of ints
cdef int[::view.indirect_contiguous, ::1] b
Please correct me if I'm wrong, but I assume a "contiguous list of pointers to contiguous lists of ints" means something like the array created by the following C++ dummy code:
// we want to create a 'contiguous list of pointers to contiguous lists of ints'
int** array;
// allocate row-pointers
// This is the 'contiguous list of pointers' related to the first dimension:
array = new int*[ROW_COUNT];
// allocate some rows, each row is a 'contiguous list of ints'
array[0] = new int[COL_COUNT]{1,2,3};
So if I understand correctly, in my Cython code it should be possible to get a memoryview from an int** like this:
cdef int** list_of_pointers = get_pointers()
cdef int[::view.indirect_contiguous, ::1] view = <int[:ROW_COUNT:view.indirect_contiguous,:COL_COUNT:1]> list_of_pointers
But I get a compile error:
cdef int[::view.indirect_contiguous, ::1] view = <int[:ROW_COUNT:view.indirect_contiguous,:COL_COUNT:1]> list_of_pointers
^
------------------------------------------------------------
memview_test.pyx:76:116: Pointer base type does not match cython.array base type
What did I do wrong? Am I missing any casts, or did I misunderstand the concept of indirect_contiguous?
Recommended answer
Let's set the record straight: a typed memoryview can only be used with objects which implement the buffer protocol.
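As a rough illustration in plain Python (not Cython), the built-in memoryview has the same requirement, so it can serve as a quick probe. The helper name supports_buffer_protocol is made up for this sketch.

```python
import array


def supports_buffer_protocol(obj):
    """Return True if obj can be wrapped by the built-in memoryview.

    Hypothetical helper for illustration: memoryview() raises TypeError
    for objects that do not implement the buffer protocol.
    """
    try:
        with memoryview(obj):
            return True
    except TypeError:
        return False


print(supports_buffer_protocol(bytearray(4)))              # buffer exporter
print(supports_buffer_protocol(array.array("i", [1, 2])))  # buffer exporter
print(supports_buffer_protocol([1, 2, 3]))                 # plain list: no buffer
```

A raw C pointer is in the same position as the plain list here: there is no object behind it that could answer the buffer request.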
Raw C pointers obviously don't implement the buffer protocol. But you might ask why something like the following quick-and-dirty code works:
%%cython
from libc.stdlib cimport calloc

def f():
    cdef int* v = <int *>calloc(4, sizeof(int))
    cdef int[:] b = <int[:4]>v
    return b[0]  # leaks memory, so what?
Here, a pointer (v) is used to construct a typed memoryview (b). There is, however, more going on under the hood (as can be seen in the cythonized C file):
- a cython-array (i.e. cython.view.array) is constructed, which wraps the raw pointer and can expose it via the buffer protocol
- this array is used for the creation of the typed memoryview.
Your understanding of what view.indirect_contiguous is used for is right: it is exactly what you want. The problem, however, is view.array, which simply cannot handle this type of data layout.
view.indirect and view.indirect_contiguous correspond to PyBUF_INDIRECT in buffer-protocol parlance, and for this the field suboffsets must contain some meaningful values (i.e. >=0 for some dimensions). However, as can be seen in the source code, view.array doesn't have this member at all, so there is no way it can represent this more complex memory layout.
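For contrast, here is a small plain-Python sketch (an illustration, not part of Cython's API) of what a direct buffer reports. The 2x3 shape and the variable names are arbitrary choices for this example; strides and suboffsets are the standard attributes of the built-in memoryview.

```python
import struct

int_size = struct.calcsize("i")  # size of a C int on this platform

# A direct 2x3 buffer of C ints backed by one single block of memory.
flat = memoryview(bytearray(6 * int_size))
grid = flat.cast("i", shape=[2, 3])

direct_strides = grid.strides        # byte steps per dimension
direct_suboffsets = grid.suboffsets  # empty tuple: nothing to dereference

print(direct_strides, direct_suboffsets)
```

An exporter of an int** layout would instead have to report a non-negative suboffset for the first dimension, which is the piece of information view.array has no place to store.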
Where does that leave us? As pointed out by @chrisb and @DavidW in your other question, you will have to implement a wrapper which can expose your data structure via the buffer protocol.
There are data structures in Python which use the indirect memory layout, most prominently the PIL arrays. A good starting point for understanding how suboffsets are supposed to work is this piece of documentation:
void *get_item_pointer(int ndim, void *buf, Py_ssize_t *strides,
                       Py_ssize_t *suboffsets, Py_ssize_t *indices) {
    char *pointer = (char*)buf;                             // A
    int i;
    for (i = 0; i < ndim; i++) {
        pointer += strides[i] * indices[i];                 // B
        if (suboffsets[i] >= 0) {
            pointer = *((char**)pointer) + suboffsets[i];   // C
        }
    }
    return (void*)pointer;                                  // D
}
In your case strides and suboffsets would be
- strides=[sizeof(int*), sizeof(int)] (i.e. [8,4] on usual x86_64 machines)
- suboffsets=[0,-1], i.e. only the first dimension is indirect.
Getting the address of element [x,y] would then happen as follows:
- in line A, pointer is set to buf, let's assume BUF.
- first dimension:
  - in line B, pointer becomes BUF+x*8 and points to the location of the pointer to the x-th row.
  - because suboffsets[0]>=0, we dereference the pointer in line C, so it now holds the address ROW_X, the start of the x-th row.
- second dimension:
  - in line B we get the address of the y-th element using strides, i.e. pointer=ROW_X+4*y
  - the second dimension is direct (signaled by suboffsets[1]<0), so no dereferencing is needed.
- in line D, pointer holds the address of the desired element and is returned.
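The steps above can be tried out with a small Python sketch that uses ctypes to build an int**-like structure and then walks it the same way. This is an illustration only: get_item_address is a hand-written Python port of get_item_pointer, and the names rows, row_pointers and element are made up for this example.

```python
import ctypes


def get_item_address(buf, strides, suboffsets, indices):
    """Python port of get_item_pointer; works on integer addresses."""
    pointer = buf                                            # A
    for stride, suboffset, index in zip(strides, suboffsets, indices):
        pointer += stride * index                            # B
        if suboffset >= 0:
            # C: read the pointer stored at this address, then add the suboffset
            pointer = ctypes.c_void_p.from_address(pointer).value + suboffset
    return pointer                                           # D


# Two separately allocated rows of three C ints each.
rows = [(ctypes.c_int * 3)(1, 2, 3), (ctypes.c_int * 3)(4, 5, 6)]

# Contiguous array of pointers to the rows (plays the role of the int**).
int_ptr = ctypes.POINTER(ctypes.c_int)
row_pointers = (int_ptr * 2)(*(ctypes.cast(row, int_ptr) for row in rows))

strides = [ctypes.sizeof(ctypes.c_void_p), ctypes.sizeof(ctypes.c_int)]
suboffsets = [0, -1]  # only the first dimension is indirect


def element(x, y):
    """Read element [x, y] by following strides and suboffsets."""
    address = get_item_address(
        ctypes.addressof(row_pointers), strides, suboffsets, [x, y]
    )
    return ctypes.c_int.from_address(address).value


print(element(0, 0), element(1, 2))
```

The rows list has to stay alive for as long as row_pointers is in use, since the pointer array only stores addresses and does not own the row memory.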
FWIW, I have implemented a library which is able to export int** and similar memory layouts via the buffer protocol: https://github.com/realead/indirect_buffer.