Pickle Cython Class with C pointers


Question

I am trying to write a __reduce__() method for a Cython class that contains C pointers, but so far I have found very little information on the best way to do this. There are plenty of examples of how to write a __reduce__() method when the member data are NumPy arrays. I'd like to stay away from NumPy arrays, as they seem to always be stored as Python objects and require calls to and from the Python API. I come from a C background, so I am comfortable managing memory manually with malloc() and free(), and I am trying to keep Python interaction to an absolute minimum.

However, I have run into a problem. I need something equivalent to copy.deepcopy() for the class I am creating, callable from the Python script where it will ultimately be used. The only good way I have found to do this is to implement the pickle protocol for the class by writing a __reduce__() method. This is trivial for most primitives and Python objects, but I am at a loss as to how to do it for dynamically allocated C arrays. Obviously I can't return the pointer itself, since the underlying memory will be gone by the time the object is reconstructed. So what is the best way to do this? I expect it will require changes to the __reduce__() method and to one or both of the __cinit__() and __init__() methods.

I have read the Python documentation on pickling extension types found here, as well as just about every other question on Stack Overflow about pickling Cython classes, such as this question.

A condensed version of my class looks something like this:

from libc.stdlib cimport malloc, free

cdef class Bin:
    cdef int* job_ids
    cdef int* jobs
    cdef int primitive_data

    def __cinit__(self):
        self.job_ids = <int*>malloc(40 * sizeof(int))
        self.jobs = <int*>malloc(40 * sizeof(int))

    def __init__(self, int val):
        self.primitive_data = val

    def __dealloc__(self):
        free(self.job_ids)
        free(self.jobs)

    def __reduce__(self):
        # Only the primitive is carried over; the two C arrays are the open problem.
        return (self.__class__, (self.primitive_data,))

Recommended answer

One approach is to serialise the data in your array into a Python bytes object. The __reduce__ method first calls the get_data method, which casts the data pointer to <char*> and then to <bytes> (if you try to go there directly, Cython doesn't know how to do it). __reduce__ returns this object along with a reference to the rebuild function (a module-level function, not a method!), which can be used to recreate the instance using the set_data method. If you need to pass more than one array, as in your example, you just need to accept more arguments in rebuild and extend the tuple returned by __reduce__.

I haven't done much testing on this, but it seems to work. It would probably explode if you passed it malformed data.

from cpython.mem cimport PyMem_Malloc, PyMem_Free
from libc.string cimport memcpy

cdef int length = 40  # number of elements in every instance's array

cdef class MyClass:
    cdef long *data

    def __cinit__(self):
        self.data = <long*>PyMem_Malloc(sizeof(long)*length)
        if not self.data:
            raise MemoryError()

    cdef bytes get_data(self):
        # slicing a char* copies exactly that many bytes into a new bytes object
        return <bytes>(<char *>self.data)[:sizeof(long)*length]

    cdef void set_data(self, bytes data):
        # assumes len(data) == sizeof(long)*length; the size is not checked here
        memcpy(self.data, <char*>data, sizeof(long)*length)

    def set_values(self):
        # assign some dummy data to the array 0..length
        for n in range(0, length):
            self.data[n] = n

    def get(self, i):
        # get the ith value of the data (no bounds check)
        return self.data[i]

    def __reduce__(self):
        # pickle calls rebuild(data) to recreate the instance
        data = self.get_data()
        return (rebuild, (data,))

    def __dealloc__(self):
        PyMem_Free(self.data)

cpdef object rebuild(bytes data):
    cdef MyClass c = MyClass()  # typed so the cdef method set_data is reachable
    c.set_data(data)
    return c
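
On the malformed-data caveat: one possible guard is to compare the payload size against the buffer size before copying. The sketch below is not the answer's code. It is a pure-Python stand-in that uses a ctypes array in place of the `long *data` member so that it runs without compiling, and the names `Buffer`, `rebuild_checked` and `NBYTES` are made up for illustration. The same length comparison could be placed in the Cython rebuild function before memcpy is called.

```python
import ctypes

LENGTH = 40                      # mirrors `length` in the Cython module
ITEM = ctypes.c_long             # element type, standing in for C `long`
NBYTES = ctypes.sizeof(ITEM) * LENGTH


class Buffer:
    """Hypothetical stand-in for MyClass; a ctypes array plays the C buffer."""

    def __init__(self):
        self.data = (ITEM * LENGTH)()   # fixed-size, zero-initialised block

    def get_data(self):
        # Copy the raw bytes of the buffer into an immutable bytes object.
        return bytes(self.data)

    def set_data(self, payload):
        # Reject anything that is not exactly the buffer's size, so the
        # copy below can never read or write out of bounds.
        if not isinstance(payload, bytes) or len(payload) != NBYTES:
            raise ValueError("expected %d bytes of payload" % NBYTES)
        ctypes.memmove(self.data, payload, NBYTES)

    def __reduce__(self):
        return (rebuild_checked, (self.get_data(),))


def rebuild_checked(payload):
    # Module-level function, as in the answer, so it can be found by name.
    obj = Buffer()
    obj.set_data(payload)
    return obj
```

A size check only catches truncated or padded payloads. Bytes produced on a platform with a different `long` size or byte order would still need separate handling.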

Example usage (assuming MyClass is in hello.pyx):

import hello
import pickle

c1 = hello.MyClass()
c1.set_values()
print('c1', c1)
print('fifth item', c1.get(5))

d = pickle.dumps(c1)
del c1  # delete the original object

c2 = pickle.loads(d)
print('c2', c2)
print('fifth item', c2.get(5))
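
Since the original goal was an equivalent of copy.deepcopy(), note that copy.deepcopy() falls back to __reduce_ex__()/__reduce__() when a class defines no __deepcopy__(), so the same hook serves both purposes. The sketch below shows that, together with the "more than one array" extension, for a class shaped like the Bin from the question. It is a pure-Python stand-in under stated assumptions: ctypes arrays replace the malloc'd int* members so it runs without compiling, and `rebuild_bin`, `IntArray` and `N_JOBS` are hypothetical names.

```python
import copy
import ctypes

N_JOBS = 40                        # same fixed size as the malloc calls
IntArray = ctypes.c_int * N_JOBS   # stand-in for a malloc'd int* block


class Bin:
    def __init__(self, val):
        self.job_ids = IntArray()
        self.jobs = IntArray()
        self.primitive_data = val

    def __reduce__(self):
        # One tuple entry per piece of state: the primitive, then a
        # bytes snapshot of each array.
        return (rebuild_bin,
                (self.primitive_data, bytes(self.job_ids), bytes(self.jobs)))


def rebuild_bin(val, job_ids, jobs):
    # Module-level function: allocate a fresh object, then copy the
    # snapshots into its own buffers.
    b = Bin(val)
    ctypes.memmove(b.job_ids, job_ids, ctypes.sizeof(b.job_ids))
    ctypes.memmove(b.jobs, jobs, ctypes.sizeof(b.jobs))
    return b


original = Bin(7)
original.job_ids[3] = 42

clone = copy.deepcopy(original)    # goes through __reduce__ and rebuild_bin
clone.job_ids[3] = -1              # mutate the copy only

print(original.job_ids[3], clone.job_ids[3])
```

Because each copy owns its own buffers, changing the clone leaves the original untouched. In the Cython version the same shape would apply, with get_data/set_data style helpers for each pointer.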
