Returning numpy arrays via pybind11


Problem description

I have a C++ function computing a large tensor which I would like to return to Python as a NumPy array via pybind11.

From the documentation of pybind11, it seems like using an STL unique_ptr is desirable. In the following example, the commented-out version works, whereas the given one compiles but fails at runtime ("Unable to convert function return value to a Python type!").

Why is the smart-pointer version failing? What is the canonical way to create and return a NumPy array?

PS: Due to program structure and size of the array, it is desirable to not copy memory but create the array from a given pointer. Memory ownership should be taken by Python.

typedef typename py::array_t<double, py::array::c_style | py::array::forcecast> py_cdarray_t;

// py_cdarray_t _test()
std::unique_ptr<py_cdarray_t> _test()
{
    double * memory = new double[3]; memory[0] = 11; memory[1] = 12; memory[2] = 13;
    py::buffer_info bufinfo (
        memory,                                   // pointer to memory buffer
        sizeof(double),                           // size of underlying scalar type
        py::format_descriptor<double>::format(),  // python struct-style format descriptor
        1,                                        // number of dimensions
        { 3 },                                    // buffer dimensions
        { sizeof(double) }                        // strides (in bytes) for each index
    );

    //return py_cdarray_t(bufinfo);
    return std::unique_ptr<py_cdarray_t>( new py_cdarray_t(bufinfo) );
}

Solution

A few comments (then a working implementation).

  • pybind11's C++ object wrappers around Python types (like pybind11::object, pybind11::list, and, in this case, pybind11::array_t<T>) are really just wrappers around an underlying Python object pointer. In this respect they are already taking on the role of a shared pointer wrapper, so there's no point in wrapping one in a unique_ptr: returning the py::array_t<T> object directly is already essentially just returning a glorified pointer.
  • pybind11::array_t can be constructed directly from a data pointer, so you can skip the py::buffer_info intermediate step and give the shape and strides directly to the pybind11::array_t constructor. When a base object is passed as well (see the next point), the resulting numpy array won't own its data; it will just reference it (that is, the numpy owndata flag will be set to false). Without a base object, pybind11 copies the data into a new array instead.
  • Memory ownership can be tied to the life of a Python object, but you're still on the hook for doing the deallocation properly. pybind11 provides a py::capsule class to help you do exactly this. What you want to do is make the numpy array depend on this capsule as its parent object by specifying it as the base argument to array_t. That makes the numpy array hold a reference to the capsule, keeping it alive as long as the array itself is alive, and the capsule's cleanup function is invoked once it is no longer referenced.
  • The c_style flag in the older (pre-2.2) releases only had an effect on new arrays, i.e. when not passing a value pointer. That was fixed in the 2.2 release to also affect the automatic strides if you specify only shapes but not strides. It has no effect at all if you specify the strides directly yourself (as I do in the example below).
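
The ownership model in the first and third points can be illustrated without pybind11 at all. The following is only an analogy in plain C++, not pybind11 code: std::shared_ptr stands in for the reference-counted Python object handle, and its custom deleter stands in for the capsule's cleanup function. The names make_buffer and cleanup_calls are invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Counts how often the cleanup function has run (illustration only).
static int cleanup_calls = 0;

// Hypothetical helper: allocate n doubles and hand them to a reference-counted
// handle. The deleter plays the role of the capsule destructor: it runs once,
// when the last handle referring to the buffer goes away.
std::shared_ptr<double> make_buffer(std::size_t n) {
    assert(n > 0);
    double *raw = new double[n];
    for (std::size_t i = 0; i < n; i++) {
        raw[i] = static_cast<double>(i);
    }
    return std::shared_ptr<double>(raw, [](double *p) {
        ++cleanup_calls;
        delete[] p;
    });
}
```

Copying the returned handle only bumps a reference count, which is why wrapping such a handle in a further unique_ptr adds nothing. The buffer is released when the last copy is dropped.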

So, putting the pieces together, this code is a complete pybind11 module that demonstrates how you can accomplish what you're looking for (and includes some C++ output to demonstrate that it is indeed working correctly):

#include <iostream>
#include <pybind11/pybind11.h>
#include <pybind11/numpy.h>

namespace py = pybind11;

// PYBIND11_MODULE supersedes PYBIND11_PLUGIN, which is deprecated since pybind11 2.2.
PYBIND11_MODULE(numpywrap, m) {
    m.def("f", []() {
        // Allocate and initialize some data; make this big so
        // we can see the impact on the process memory use:
        constexpr size_t size = 100*1000*1000;
        double *foo = new double[size];
        for (size_t i = 0; i < size; i++) {
            foo[i] = (double) i;
        }

        // Create a Python object that will free the allocated
        // memory when destroyed:
        py::capsule free_when_done(foo, [](void *f) {
            double *foo = reinterpret_cast<double *>(f);
            std::cerr << "Element [0] = " << foo[0] << "\n";
            std::cerr << "freeing memory @ " << f << "\n";
            delete[] foo;
        });

        return py::array_t<double>(
            {100, 1000, 1000}, // shape
            {1000*1000*8, 1000*8, 8}, // C-style contiguous strides for double (bytes)
            foo, // the data pointer
            free_when_done); // numpy array references this parent
    });
}
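
The strides in the constructor call above are written out by hand. As a sketch of where those numbers come from, a small helper can derive C-style byte strides from a shape and an element size. The name c_style_strides is hypothetical; it is not a pybind11 function.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: byte strides of a C-style (row-major) contiguous array.
// The last axis advances by one element; each earlier axis advances by the
// byte size of everything to its right.
std::vector<std::ptrdiff_t> c_style_strides(const std::vector<std::ptrdiff_t> &shape,
                                            std::ptrdiff_t itemsize) {
    assert(itemsize > 0);
    std::vector<std::ptrdiff_t> strides(shape.size());
    std::ptrdiff_t step = itemsize;
    for (std::size_t axis = shape.size(); axis-- > 0;) {
        strides[axis] = step;
        step *= shape[axis];
    }
    return strides;
}
```

For the shape {100, 1000, 1000} and 8-byte doubles this yields {8000000, 8000, 8}, matching the values passed in the example.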

Compiling that and invoking it from Python shows it working:

>>> import numpywrap
>>> z = numpywrap.f()
>>> # the python process is now taking up a bit more than 800MB memory
>>> z[1,1,1]
1001001.0
>>> z[0,0,100]
100.0
>>> z[99,999,999]
99999999.0
>>> z[0,0,0] = 3.141592
>>> del z
Element [0] = 3.14159
freeing memory @ 0x7fd769f12010
>>> # python process memory size has dropped back down
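
The values read back in this session follow from the layout: the module fills the buffer with foo[i] = i, so each element equals its own flat offset. Below is a sketch of that offset computation, using the hypothetical helper name flat_offset and dimensions hard-coded to match the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: flat element offset of index (i, j, k) in the
// C-style contiguous 100 x 1000 x 1000 array from the example.
std::size_t flat_offset(std::size_t i, std::size_t j, std::size_t k) {
    const std::size_t dim1 = 1000, dim2 = 1000;
    assert(i < 100 && j < dim1 && k < dim2);
    return (i * dim1 + j) * dim2 + k;
}
```

For instance, flat_offset(1, 1, 1) is 1001001, which is the value printed for z[1,1,1].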
