Return a 2D array created by a C function to Python using Cython

Question

I want to use a 2D array created by a C function in Python. I asked how to do this earlier today, and one approach, suggested by @Abhijit Pritam, was to use structs. I implemented it and it works.

C code:

typedef struct {
  int arr[3][5];
} Array;

Array make_array_struct() {
  Array my_array;
  int count = 0;
  for (int i = 0; i < 3; i++)
    for (int j = 0; j  < 5; j++)
      my_array.arr[i][j] = ++count;
  return my_array;
}

In Python I have this:

cdef extern from "numpy_fun.h":
    ctypedef struct Array:
        int[3][5] arr
    cdef Array make_array_struct()

def make_array():
    cdef Array arr = make_array_struct()
    return arr

my_arr = make_array()
my_arr['arr']
[[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]

However, it was suggested that this is not the best approach to the problem, because it is possible to have Python take control of the data instead. I'm trying to implement this but haven't managed to so far. This is what I have.

C code:

int **make_array_ptr() {
  int **my_array = (int **)malloc(3 * sizeof(int *));
  my_array[0] = calloc(3 * 5, sizeof(int));
  for (int i = 1; i < 3; i++)
    my_array[i] = my_array[0] + i * 5;
  int count = 0;
  for (int i = 0; i < 3; i++)
    for (int j = 0; j < 5; j++)
      my_array[i][j] = ++count;
  return my_array;
}

python:

import numpy as np
cimport numpy as np

np.import_array()

ctypedef np.int32_t DTYPE_t

cdef extern from "numpy/arrayobject.h":
    void PyArray_ENABLEFLAGS(np.ndarray arr, int flags)

cdef extern from "numpy_fun.h":
    cdef int **make_array_ptr()

def make_array():
    cdef int[::1] dims = np.array([3, 5], dtype=np.int32)
    cdef DTYPE_t **data = <DTYPE_t **>make_array_ptr()
    cdef np.ndarray[DTYPE_t, ndim=2] my_array = np.PyArray_SimpleNewFromData(2, &dims[0], np.NPY_INT32, data)
    PyArray_ENABLEFLAGS(my_array, np.NPY_OWNDATA)
    return my_array

I was following "Force NumPy ndarray to take ownership of its memory in Cython", which seems to be what I need to do. My case is different because I need a 2D array, so I will likely have to do things a bit differently: for example, the function expects data to be a pointer to int, and I gave it a pointer to pointer to int. What do I have to do to use this approach?

Recommended answer

The problems with the struct approach are:

  1. It breaks as soon as you want anything but a fixed-size array, with no real way of fixing it.

  2. It relies on Cython's implicit conversion from structs to dicts. Cython copies the data to a Python list, which isn't terribly efficient. This isn't an issue with the small arrays you have here, but it's wasteful for larger arrays.
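To make the copying point concrete, here is a minimal sketch that uses the standard-library ctypes module as a stand-in for the C struct. It is not Cython's own conversion mechanism; the class name Array and the helper make_array_struct simply mirror the question, and the list conversion is written by hand to imitate what an element-by-element copy does.

```python
import ctypes


class Array(ctypes.Structure):
    # Mirrors `typedef struct { int arr[3][5]; } Array;` from the question.
    _fields_ = [("arr", (ctypes.c_int * 5) * 3)]


def make_array_struct():
    """Fill the fixed-size struct with 1..15, like the C function."""
    s = Array()
    count = 0
    for i in range(3):
        for j in range(5):
            count += 1
            s.arr[i][j] = count
    return s


s = make_array_struct()

# Converting to nested lists visits and copies every element.
as_lists = [list(row) for row in s.arr]

# The lists are independent of the struct: changing one leaves the other alone.
as_lists[0][0] = -1
print(s.arr[0][0], as_lists[0][0])
```

The copy costs time and memory proportional to the array size, and the 3 x 5 shape is baked into the type, which is the first problem in the list above.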






I also don't really recommend 2D arrays as pointers-to-pointers. The way numpy (and most other sensible array libraries) implements 2D arrays is to store a 1D array together with the shape of the 2D array, and to use the shape to work out which index to access. This tends to be more efficient (faster lookups, faster allocation) and also easier to use (fewer allocations and deallocations to keep track of).
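The "1D array plus shape" idea can be sketched in plain Python. This is only an illustration of row-major indexing, not numpy's implementation; the helper name flat_index and the constants NROWS and NCOLS are made up for the example.

```python
NROWS, NCOLS = 3, 5


def flat_index(i, j, ncols=NCOLS):
    """Row-major offset of element (i, j) in a flat buffer."""
    return i * ncols + j


# One flat allocation instead of one allocation per row.
flat = [0] * (NROWS * NCOLS)
count = 0
for i in range(NROWS):
    for j in range(NCOLS):
        count += 1
        flat[flat_index(i, j)] = count

# The 2D structure is recovered from the shape alone.
rows = [flat[r * NCOLS:(r + 1) * NCOLS] for r in range(NROWS)]
print(rows)
```

Element (i, j) lives at offset i * NCOLS + j, which is the same arithmetic as my_array[j+i*5] in the C code that follows.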

To do this, change the C code to:

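#include <stdint.h>  /* int32_t */
#include <stdlib.h>  /* calloc */

/* Returns one contiguous, row-major 3x5 block; the caller owns the memory. */
int32_t *make_array_ptr(void) {
  int32_t *my_array = calloc(3 * 5, sizeof(int32_t));
  if (my_array == NULL)
    return NULL;  /* allocation failed */
  int count = 0;
  for (int i = 0; i < 3; i++)
    for (int j = 0; j < 5; j++)
      my_array[j+i*5] = ++count;
  return my_array;
}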

I've deleted the first loop that you immediately overwrite. I've also changed the type to int32_t, since you seem to rely on this in your Cython code later.

The Cython code is then very close to what you were using:

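def make_array():
    # Shape of the result; PyArray_SimpleNewFromData expects npy_intp dimensions.
    cdef np.intp_t dims[2]
    dims[0] = 3
    dims[1] = 5
    # Flat, contiguous buffer allocated by the C function.
    cdef np.int32_t *data = make_array_ptr()
    cdef np.ndarray[np.int32_t, ndim=2] my_array = np.PyArray_SimpleNewFromData(2, &dims[0], np.NPY_INT32, data)
    # Hand ownership of the buffer to numpy so it is freed along with the array.
    PyArray_ENABLEFLAGS(my_array, np.NPY_OWNDATA)
    return my_array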

The main changes are that I've removed some casts and allocated dims as a static array, which seemed simpler than a memoryview. The cdef extern declaration of make_array_ptr also has to be updated to match the new return type.

I don't think it's particularly easy to let numpy handle a pointer-to-pointer array. It might be possible by implementing the Python buffer interface, but that seems like a lot of work and may not be easy.
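For contrast, a single contiguous block fits the buffer interface with no extra work. The sketch below uses only the standard library (array and memoryview) as a stand-in for the C buffer; it does not involve numpy or Cython, and the names flat and view2d are just for illustration.

```python
from array import array

NROWS, NCOLS = 3, 5

# One contiguous block of C ints holding 1..15, like the calloc'd buffer.
flat = array("i", range(1, NROWS * NCOLS + 1))

# Reinterpret the same memory as a 3x5 view. memoryview.cast needs a byte
# format on one side of each cast, hence the intermediate "B" step.
view2d = memoryview(flat).cast("B").cast("i", shape=[NROWS, NCOLS])

print(view2d.tolist())

# No copy was made: a write through the flat buffer shows up in the 2D view.
flat[7] = 99
print(view2d[1, 2])
```

The view only needs a base address, an item type and a shape. A pointer-to-pointer layout has no single base block to describe this way, which is part of why supporting it takes more effort.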
