Returning numpy arrays via pybind11
Problem description
I have a C++ function computing a large tensor which I would like to return to Python as a NumPy array via pybind11.
From the documentation of pybind11, it seems like using STL unique_ptr is desirable. In the following example, the commented out version works, whereas the given one compiles but fails at runtime ("Unable to convert function return value to a Python type!").
Why is the smart pointer version failing? What is the canonical way to create and return a NumPy array?
PS: Due to program structure and size of the array, it is desirable to not copy memory but create the array from a given pointer. Memory ownership should be taken by Python.
typedef typename py::array_t<double, py::array::c_style | py::array::forcecast> py_cdarray_t;

// py_cdarray_t _test()
std::unique_ptr<py_cdarray_t> _test()
{
    double * memory = new double[3]; memory[0] = 11; memory[1] = 12; memory[2] = 13;
    py::buffer_info bufinfo (
        memory,                                   // pointer to memory buffer
        sizeof(double),                           // size of underlying scalar type
        py::format_descriptor<double>::format(),  // python struct-style format descriptor
        1,                                        // number of dimensions
        { 3 },                                    // buffer dimensions
        { sizeof(double) }                        // strides (in bytes) for each index
    );
    //return py_cdarray_t(bufinfo);
    return std::unique_ptr<py_cdarray_t>( new py_cdarray_t(bufinfo) );
}
A few comments (then a working implementation).
- pybind11's C++ object wrappers around Python types (like pybind11::object, pybind11::list, and, in this case, pybind11::array_t<T>) are really just wrappers around an underlying Python object pointer. In this respect they already take on the role of a shared pointer wrapper, so there's no point in wrapping one in a unique_ptr: returning the py::array_t<T> object directly is already essentially just returning a glorified pointer.
- pybind11::array_t can be constructed directly from a data pointer, so you can skip the py::buffer_info intermediate step and give the shape and strides directly to the pybind11::array_t constructor. A numpy array constructed this way together with a base object (see the next point) won't own its data, it'll just reference it (that is, the numpy owndata flag will be set to false). Without a base object, pybind11 copies the data into a new array instead.
- Memory ownership can be tied to the life of a Python object, but you're still on the hook for doing the deallocation properly. pybind11 provides a py::capsule class to help you do exactly this. What you want to do is make the numpy array depend on this capsule as its parent object by specifying it as the base argument to array_t. That will make the numpy array reference it, keeping it alive as long as the array itself is alive, and invoke the cleanup function when it is no longer referenced.
- The c_style flag in the older (pre-2.2) releases only had an effect on new arrays, i.e. when not passing a value pointer. That was fixed in the 2.2 release to also affect the automatic strides if you specify only shapes but not strides. It has no effect at all if you specify the strides directly yourself (as I do in the example below).
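For reference, the C-style (row-major) byte strides mentioned in the last point follow a simple rule: the last axis steps by one item, and every earlier axis steps by the byte size of everything to its right. A minimal sketch of that rule, using only the standard library; the helper name c_contiguous_strides is hypothetical and not part of pybind11:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a pybind11 API): byte strides of a
// C-contiguous (row-major) array with the given shape and item size.
std::vector<std::ptrdiff_t> c_contiguous_strides(
        const std::vector<std::ptrdiff_t> &shape, std::ptrdiff_t itemsize) {
    assert(itemsize > 0);
    // Start with every axis at one item; only the last axis keeps that value.
    std::vector<std::ptrdiff_t> strides(shape.size(), itemsize);
    // Walk from the second-to-last axis towards the first: each stride is
    // the stride of the axis to its right times that axis' extent.
    for (std::size_t i = shape.size(); i > 1; --i) {
        strides[i - 2] = strides[i - 1] * shape[i - 1];
    }
    return strides;
}
```

For a shape of {100, 1000, 1000} and an item size of sizeof(double) == 8 this yields {8000000, 8000, 8}, i.e. the same values as the hand-written {1000*1000*8, 1000*8, 8}. Since pybind11 2.2 can derive these strides itself when only the shape is given, a helper like this is mostly useful for checking strides you write by hand.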
So, putting the pieces together, this code is a complete pybind11 module that demonstrates how you can accomplish what you're looking for (and includes some C++ output to demonstrate that it is indeed working correctly):
#include <iostream>
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

namespace py = pybind11;

// PYBIND11_MODULE replaces the PYBIND11_PLUGIN macro, which is
// deprecated since pybind11 2.2.
PYBIND11_MODULE(numpywrap, m) {
    m.def("f", []() {
        // Allocate and initialize some data; make this big so
        // we can see the impact on the process memory use:
        constexpr size_t size = 100*1000*1000;
        double *foo = new double[size];
        for (size_t i = 0; i < size; i++) {
            foo[i] = (double) i;
        }

        // Create a Python object that will free the allocated
        // memory when destroyed:
        py::capsule free_when_done(foo, [](void *f) {
            double *foo = reinterpret_cast<double *>(f);
            std::cerr << "Element [0] = " << foo[0] << "\n";
            std::cerr << "freeing memory @ " << f << "\n";
            delete[] foo;
        });

        return py::array_t<double>(
            {100, 1000, 1000},          // shape
            {1000*1000*8, 1000*8, 8},   // C-style contiguous strides for double
            foo,                        // the data pointer
            free_when_done);            // numpy array references this parent
    });
}
Compiling that and invoking it from Python shows it working:
>>> import numpywrap
>>> z = numpywrap.f()
>>> # the python process is now taking up a bit more than 800MB memory
>>> z[1,1,1]
1001001.0
>>> z[0,0,100]
100.0
>>> z[99,999,999]
99999999.0
>>> z[0,0,0] = 3.141592
>>> del z
Element [0] = 3.14159
freeing memory @ 0x7fd769f12010
>>> # python process memory size has dropped back down