How to store a boolean mask as an attribute of a Cython class?
Problem description
I have not been able to store a boolean mask as an attribute of a Cython class. In the real code I need this mask to perform tasks more efficiently. Here is a sample:
core.pyx
import numpy as np
cimport numpy as np

cdef class MyClass:
    cdef public np.uint8_t[:] mask  # uint8 has the same memory layout as a boolean array
    cdef public np.float64_t[:] data

    def __init__(self, size):
        self.data = np.random.rand(size).astype(np.float64)
        self.mask = np.zeros(size, np.uint8)
script.py
import numpy as np
import pyximport
pyximport.install(setup_args={'include_dirs': np.get_include()})
from core import MyClass
mc = MyClass(1000000)
mc.mask = np.asarray(mc.data) > 0.5
Error
When I run script.py, the Cython module compiles successfully, but the script raises this error:
Traceback (most recent call last):
  File "script.py", line 8, in <module>
    mc.mask = np.asarray(mc.data) > 0.5
  File "core.pyx", line 6, in core.MyClass.mask.__set__
    cdef public np.uint8_t[:] mask
ValueError: Does not understand character buffer dtype format string ('?')
Workaround
My current workaround is to pass the mask to every function that needs it, using cast=True, for example:
cpdef func(MyClass mc, np.ndarray[np.uint8_t, ndim=1, cast=True] mask):
    return np.asarray(mc.data)[mask]
Question
Does anyone have an idea of how the mask could be stored in the Cython class?
Recommended answer
I don't believe memoryviews actually support boolean indexing anyway. Therefore, to index the array you're always going to have to do
np.asarray(mc.data)[mask]
# or
mc.data.base[mask]  # if you're sure it's always a view of something that supports boolean indexing
I don't think this will change with the Cython update that @ead mentions. I suspect the reason is that assignment (mc.data[mask] = x) would probably be fairly easy to support, but it isn't obvious what type mc.data[mask] should return, since the result isn't a memoryview.
Therefore, whatever you do is going to involve some messy code.
The assignment to the memoryview can be done with
mc.mask = (np.asarray(mc.data) > 0.5).view(np.uint8)
and it can be converted back to a NumPy bool array with:
np.asarray(mc.mask).view(np.bool_)  # np.bool_ is used because the bare np.bool alias is missing from some NumPy releases
Neither operation involves a copy.
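The two conversions above can be checked from plain Python. This is a minimal sketch, assuming only NumPy: a built-in Python memoryview stands in for the Cython np.uint8_t[:] attribute (an assumption made so the example runs without compiling anything), and the variable names are illustrative. It shows that a bool array exports the buffer format '?', which is the format the typed memoryview rejected, while the uint8 view exports 'B' and shares memory with the original array.

```python
import numpy as np

data = np.random.rand(8)
bool_mask = data > 0.5                 # dtype bool, buffer format '?'

# Reinterpret the same bytes as uint8; no data is copied.
as_uint8 = bool_mask.view(np.uint8)

# Stand-in for the Cython np.uint8_t[:] attribute (assumption: a plain
# Python memoryview is used here so the sketch runs without Cython).
mview = memoryview(as_uint8)

# Convert back to a bool array that can be used for boolean indexing.
round_trip = np.asarray(mview).view(np.bool_)

# Buffer format codes exported by the bool array and by the uint8 view.
formats = (memoryview(bool_mask).format, mview.format)

# True only if every conversion reused the original memory.
no_copy = (np.shares_memory(bool_mask, as_uint8)
           and np.shares_memory(bool_mask, round_trip))

# The round-tripped mask selects the same elements as the original.
selected = data[round_trip]
```

Because all three objects share one buffer, an in-place change to bool_mask is visible through the uint8 view as well.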
If I were designing this, I'd keep the memoryviews non-public (for Cython-only use) and have normal object attributes that just hold the underlying NumPy arrays for the Python interface. You could use property to keep them in sync (and do the casting):
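cdef class MyClass:
    cdef np.uint8_t[:] mask_mview  # typed memoryview, for fast access from Cython only
    cdef object _mask              # underlying NumPy bool array, exposed to Python

    @property
    def mask(self):
        return np.asarray(self._mask).view(np.bool_)

    @mask.setter
    def mask(self, value):
        self._mask = value
        # Reinterpret the bool array as uint8 (no copy) so the memoryview accepts it
        self.mask_mview = value.view(np.uint8)

    # and the same for data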
That way you have a memoryview for the things memoryviews are good at (iterating quickly element by element in Cython), access to the plain NumPy array from Python, and the two are kept in sync (at least when set through the Python interface).
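The synchronisation described above can be illustrated without compiling Cython. The sketch below is a pure-Python analogue, not the cdef class itself: MaskHolder and its attribute names are hypothetical, and a built-in memoryview plays the role of the np.uint8_t[:] attribute. It only demonstrates that the setter keeps the NumPy array and the uint8 view pointing at the same memory.

```python
import numpy as np


class MaskHolder:
    """Pure-Python stand-in for the cdef class (hypothetical name)."""

    def __init__(self, size):
        self._mask = None
        self._mask_mview = None  # plays the role of the np.uint8_t[:] attribute
        self.mask = np.zeros(size, dtype=np.bool_)

    @property
    def mask(self):
        # Python callers always see a NumPy bool array.
        return self._mask

    @mask.setter
    def mask(self, value):
        # No copy is made when value is already a bool array.
        value = np.asarray(value, dtype=np.bool_)
        self._mask = value
        # Same bytes, reinterpreted as uint8 for the memoryview side.
        self._mask_mview = memoryview(value.view(np.uint8))


holder = MaskHolder(4)
holder.mask = np.array([True, False, True, False])

# An in-place edit through the NumPy side...
holder.mask[1] = True

# ...is visible through the uint8 memoryview, because both share one buffer.
seen_by_mview = list(holder._mask_mview)
```

Routing every assignment through the setter is what keeps the two in sync; code that rebinds the private attributes directly would bypass it.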