Prevent numpy from creating a multidimensional array
NumPy is really helpful when creating arrays. If the first argument to numpy.array has __getitem__ and __len__ methods, they are used on the basis that the object might be a valid sequence.
Unfortunately, I want to create an array with dtype=object that contains such objects, without NumPy being "helpful".
Broken down to a minimal example, the class would look like this:
import numpy as np

class Test(object):
    def __init__(self, iterable):
        self.data = iterable

    def __getitem__(self, idx):
        return self.data[idx]

    def __len__(self):
        return len(self.data)

    def __repr__(self):
        return '{}({})'.format(self.__class__.__name__, self.data)
If the "iterables" have different lengths, everything is fine and I get exactly the result I want:
>>> np.array([Test([1,2,3]), Test([3,2])], dtype=object)
array([Test([1, 2, 3]), Test([3, 2])], dtype=object)
But NumPy creates a multidimensional array if they happen to have the same length:
>>> np.array([Test([1,2,3]), Test([3,2,1])], dtype=object)
array([[1, 2, 3],
       [3, 2, 1]], dtype=object)
Unfortunately, there is only an ndmin argument, so I was wondering if there is a way to enforce an ndmax or somehow prevent NumPy from interpreting the custom classes as another dimension (without deleting __len__ or __getitem__)?
A workaround is of course to create an array of the desired shape and then copy the data:
In [19]: lst = [Test([1, 2, 3]), Test([3, 2, 1])]
In [20]: arr = np.empty(len(lst), dtype=object)
In [21]: arr[:] = lst[:]
In [22]: arr
Out[22]: array([Test([1, 2, 3]), Test([3, 2, 1])], dtype=object)
Note that I would not be surprised if NumPy's behavior with respect to interpreting iterable objects (which is what you want to use, right?) depends on the NumPy version, and it may be buggy. Or maybe some of these bugs are actually features. In any case, I'd be wary of breakage when the NumPy version changes.
By contrast, copying into a pre-created array should be much more robust.
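The copy-into-a-pre-created-array workaround can be wrapped in a small function. The sketch below is only one way to do it: as_object_array is a hypothetical helper name, not part of NumPy, and it fills the array one element at a time on the assumption that assigning to a single slot of an object array stores the object itself rather than unpacking it.

```python
import numpy as np


class Test(object):
    """Minimal sequence-like class from the question."""

    def __init__(self, iterable):
        self.data = iterable

    def __getitem__(self, idx):
        return self.data[idx]

    def __len__(self):
        return len(self.data)

    def __repr__(self):
        return '{}({})'.format(self.__class__.__name__, self.data)


def as_object_array(items):
    """Hypothetical helper (not a NumPy function): build a 1-D object
    array holding each item as-is, without treating items as sequences."""
    items = list(items)
    # Pre-create the array with the desired 1-D shape.
    arr = np.empty(len(items), dtype=object)
    # Fill slot by slot so each object is stored, not expanded.
    for i, item in enumerate(items):
        arr[i] = item
    return arr


arr = as_object_array([Test([1, 2, 3]), Test([3, 2, 1])])
print(arr.shape)              # (2,)
print(type(arr[0]).__name__)  # Test
```

Because the shape is fixed before any data is copied, the result stays one-dimensional whether or not the wrapped iterables have equal lengths.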