How to store a boolean mask as an attribute of a Cython class?


Problem description

I have been unable to store a boolean mask as an attribute of a Cython class. In the real code I need this mask to perform tasks more efficiently. Here is a sample:

core.pyx

import numpy as np
cimport numpy as np

cdef class MyClass():
    cdef public np.uint8_t[:] mask # uint8 has the same memory layout as a boolean array
    cdef public np.float64_t[:] data

    def __init__(self, size):
        self.data = np.random.rand(size).astype(np.float64)
        self.mask = np.zeros(size, np.uint8)

script.py

import numpy as np
import pyximport
pyximport.install(setup_args={'include_dirs': np.get_include()})

from core import MyClass

mc = MyClass(1000000)
mc.mask = np.asarray(mc.data) > 0.5 

Error

When I run script.py, the Cython code compiles successfully, but the following error is raised:

Traceback (most recent call last):
  File "script.py", line 8, in <module>
    mc.mask = np.asarray(mc.data) > 0.5
  File "core.pyx", line 6, in core.MyClass.mask.__set__
    cdef public np.uint8_t[:] mask
ValueError: Does not understand character buffer dtype format string ('?')

Workaround

My current workaround is to pass the mask to every function that needs it, using cast=True, for example:

cpdef func(MyClass mc, np.ndarray[np.uint8_t, ndim=1, cast=True] mask):
    return np.asarray(mc.data)[mask]

Question

Are there any ideas on how the mask could be stored in the Cython class?

Recommended answer

I don't believe memoryviews actually support boolean indexing anyway. To index the array you will therefore always have to do one of the following:

np.asarray(mc.data)[mask]
# or
mc.data.base[mask]  # if you're sure it's always a view of something that supports boolean indexing

I don't think this will change with the Cython update that @ead mentions. I suspect the reason is that assignment (mc.data[mask] = x) would probably be fairly easy to support, but it isn't obvious what type mc.data[mask] should return, since the result isn't a memoryview.
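To illustrate why the return type is awkward, here is a minimal NumPy-only sketch (plain Python, not Cython; the array values are made up for illustration). Boolean indexing produces a new array whose length depends on the contents of the mask, so it cannot be a view of the original buffer, whereas masked assignment simply writes in place.

```python
import numpy as np

data = np.array([0.1, 0.9, 0.4, 0.7])    # illustrative values
mask = data > 0.5                          # boolean mask, dtype bool

selected = data[mask]                      # boolean indexing returns a copy
print(selected.shape)                      # length equals the number of True entries
print(np.shares_memory(selected, data))    # the result does not alias `data`

data[mask] = 0.0                           # masked assignment modifies `data` in place
print(data)
```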

Therefore, whatever you do is going to involve some messy code.

The assignment to the memoryview can be done with

mc.mask = (np.asarray(mc.data) > 0.5).view(np.uint8)

and it can be converted back to a NumPy bool array with:

np.asarray(mc.mask).view(np.bool_)

Neither of these involves a copy.
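One way to check the no-copy behaviour is NumPy's `np.shares_memory`. The sketch below is plain Python rather than Cython, with illustrative array values. It also prints the buffer format codes (`'?'` for bool, `'B'` for uint8), which is the mismatch the original error message complains about.

```python
import numpy as np

bool_mask = np.array([True, False, True])    # illustrative values
as_uint8 = bool_mask.view(np.uint8)           # reinterpret the same bytes as uint8
back_to_bool = as_uint8.view(np.bool_)        # and back again

# Buffer-protocol format codes: '?' for bool, 'B' for unsigned byte
print(memoryview(bool_mask).format, memoryview(as_uint8).format)

# Both views alias the original memory, so no data was copied
print(np.shares_memory(bool_mask, as_uint8))
print(np.shares_memory(bool_mask, back_to_bool))

as_uint8[1] = 1       # writing through the uint8 view...
print(bool_mask)      # ...is visible through the bool array
```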

If I were designing this, I'd keep the memoryviews non-public (for Cython-only use) and have normal object attributes that just hold the underlying NumPy arrays for the Python interface. You could use a property to keep them in sync (and do the casting):

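import numpy as np
cimport numpy as np

cdef class MyClass:
    cdef np.uint8_t[:] mask_mview  # typed memoryview, for Cython-only use
    cdef object _mask              # underlying NumPy bool array, for Python

    @property
    def mask(self):
        return np.asarray(self._mask).view(np.bool_)

    @mask.setter
    def mask(self, value):
        self._mask = value
        # Reinterpret the bool array as uint8 (no copy) for the memoryview
        self.mask_mview = value.view(np.uint8)

    # and the same for data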

That way you have a memoryview for the things memoryviews are good at (iterating quickly element by element in Cython), access to the plain NumPy array from Python, and the two are kept in sync (at least through the Python interface).
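The same design can be sketched in pure Python to show how the property keeps the two representations in sync. This is an analogue under stated assumptions, not the Cython class itself: `MaskHolder` and its attribute names are hypothetical, and a `uint8` NumPy view stands in for the typed memoryview.

```python
import numpy as np


class MaskHolder:
    """Pure-Python stand-in for the cdef class (hypothetical name)."""

    def __init__(self, size):
        self.mask = np.zeros(size, dtype=np.bool_)  # goes through the setter

    @property
    def mask(self):
        # Python-facing attribute: the plain boolean array
        return self._mask

    @mask.setter
    def mask(self, value):
        value = np.ascontiguousarray(value, dtype=np.bool_)
        self._mask = value
        # Stand-in for the typed memoryview: same bytes, seen as uint8
        self._mask_u8 = value.view(np.uint8)


holder = MaskHolder(4)
holder.mask = np.array([0.2, 0.8, 0.6, 0.1]) > 0.5  # illustrative values
print(holder.mask)
print(holder._mask_u8)
print(np.shares_memory(holder.mask, holder._mask_u8))
```

Because every assignment goes through the setter, the `uint8` view is rebuilt each time the mask is replaced, so the two attributes cannot drift apart through the public interface.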
