Combining features of 'array of objects' with 'object of arrays'

Question

I'm looking for some sort of paradigm or implementation to efficiently handle many sets of coupled N-dimensional arrays (ndarrays). Specifically, I'm hoping for an implementation that lets me slice an array of entire objects (e.g. someObjs = objects[100:200]) and also slice individual attributes of those objects (e.g. somePars1 = objects.par1[100:200]), with both forms available at the same time.

To expand on the above example, I could construct the following subsets in two ways:

def subset1(objects, beg, end):
    pars1 = [ obj.par1 for obj in objects[beg:end] ]
    pars2 = [ obj.par2 for obj in objects[beg:end] ]
    return pars1, pars2

def subset2(objects, beg, end):
    pars1 = objects.par1[beg:end]
    pars2 = objects.par2[beg:end]
    return pars1, pars2

And the two would return the same result.

One approach would be to override the __getitem__ (etc.) methods, something like:

class Objects(object):
    def __init__(self, p1, p2):
        self.par1 = p1
        self.par2 = p2
    ...
    def __getitem__(self, key):
        return Objects(self.par1[key], self.par2[key])

But this is horribly inefficient, and it duplicates the subset. Perhaps there's some way to return a view of the subset?

Recommended answer

Array of objects, and an object with array methods

A sample object class:

In [56]: class MyObj(object):
   ....:     def __init__(self, par1,par2):
   ....:         self.par1=par1
   ....:         self.par2=par2

An array of those objects, which is little more than a list with an array wrapper:

In [57]: objects=np.array([MyObj(1,2),MyObj(3,4),MyObj(2,3),MyObj(10,11)])
In [58]: objects
Out[58]: 
array([<__main__.MyObj object at 0xb31b196c>,
       <__main__.MyObj object at 0xb31b116c>,
       <__main__.MyObj object at 0xb31b13cc>,
       <__main__.MyObj object at 0xb31b130c>], dtype=object)

subset1 type of selection:

In [59]: [obj.par1 for obj in objects[1:-1]]
Out[59]: [3, 2]

Another class that can contain such an array. This is simpler than defining an array subclass:

In [60]: class MyObjs(object):
   ....:     def __init__(self,anArray):
   ....:         self.data=anArray
   ....:     def par1(self):
   ....:         return [obj.par1 for obj in self.data]

In [61]: Obs = MyObjs(objects)
In [62]: Obs.par1()
Out[62]: [1, 3, 2, 10]

subset2 type of selection:

In [63]: Obs.par1()[1:-1]
Out[63]: [3, 2]

For now par1 is a method, but it could be made a property, permitting the Obs.par1[1:-1] syntax.

If par1 returned an array instead of a list, indexing would be more powerful (boolean masks and index lists would work, not just slices).
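A minimal sketch of both suggestions together, assuming the MyObj class shown earlier. The class name MyObjsProp is hypothetical and used only for illustration. Here par1 is a property that returns a NumPy array, so the result accepts slices, boolean masks, and index lists:

```python
import numpy as np

class MyObj(object):
    def __init__(self, par1, par2):
        self.par1 = par1
        self.par2 = par2

class MyObjsProp(object):  # hypothetical name, not from the original answer
    def __init__(self, an_array):
        self.data = an_array

    @property
    def par1(self):
        # Builds a fresh array on every access: this is a copy, not a view
        return np.array([obj.par1 for obj in self.data])

objects = np.array([MyObj(1, 2), MyObj(3, 4), MyObj(2, 3), MyObj(10, 11)])
obs = MyObjsProp(objects)

sliced = obs.par1[1:-1]           # plain slice, no call parentheses needed
masked = obs.par1[obs.par1 > 2]   # boolean mask
picked = obs.par1[[0, 3]]         # list of indices
```

Note that the property still loops over every object in Python each time it is read, so this improves the syntax rather than the speed.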

If MyObjs had a __getitem__ method, then it could be indexed with

Obs[1:-1]

That method could be defined in various ways, though the simplest is to apply the index or slice directly to the 'data' attribute:

def __getitem__(self, key):
    # not tested
    # apply the index or slice to the wrapped array and re-wrap the result
    return MyObjs(self.data[key])
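For reference, a self-contained sketch of the container class with such a __getitem__ added. It assumes the MyObj and MyObjs definitions from above and is intended for slices, masks, or index lists (an integer index would wrap a single MyObj, which par1() cannot iterate over):

```python
import numpy as np

class MyObj(object):
    def __init__(self, par1, par2):
        self.par1 = par1
        self.par2 = par2

class MyObjs(object):
    def __init__(self, an_array):
        self.data = an_array

    def par1(self):
        return [obj.par1 for obj in self.data]

    def __getitem__(self, key):
        # Index the wrapped object array, then wrap the result again
        return MyObjs(self.data[key])

objects = np.array([MyObj(1, 2), MyObj(3, 4), MyObj(2, 3), MyObj(10, 11)])
Obs = MyObjs(objects)

sub = Obs[1:-1]                      # a new MyObjs wrapping a slice
sub_par1 = sub.par1()
shares = np.shares_memory(sub.data, objects)  # basic slice is a view
```

With a basic slice, the wrapped array is a view of the original pointer array, so only the small wrapper object is newly created.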

I'm focusing just on syntax, not on efficiency. In general, numpy arrays of general objects are not very fast or powerful. Such arrays are basically lists of pointers to the objects.
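On the efficiency concern raised in the question: when the attributes are plain numeric ndarrays (the 'object of arrays' layout), basic slicing returns views rather than copies, so a __getitem__ wrapper like the one in the question only creates a small new wrapper object. The sketch below assumes the Objects class from the question, with np.asarray added as an illustrative choice:

```python
import numpy as np

class Objects(object):
    def __init__(self, p1, p2):
        # np.asarray avoids a copy when the input is already an ndarray
        self.par1 = np.asarray(p1)
        self.par2 = np.asarray(p2)

    def __getitem__(self, key):
        # With a slice key, both attributes are views of the parent arrays
        return Objects(self.par1[key], self.par2[key])

objects = Objects(np.arange(10), np.arange(10) * 2)
sub = objects[2:5]

shares = np.shares_memory(sub.par1, objects.par1)
sub.par1[0] = -1   # writes through to objects.par1[2]
```

This holds for basic (slice) indexing only. Boolean masks and index lists always produce copies in NumPy.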

Structured array and recarray versions

Another possibility is np.recarray. (Another poster was just asking about their names.) They are essentially structured arrays whose fields can also be accessed as attributes.

With a structured array definition:

In [64]: dt = np.dtype([('par1', int), ('par2', int)])
In [66]: Obj1 = np.array([(1,2),(3,4),(2,3),(10,11)], dtype=dt)
In [67]: Obj1
Out[67]: 
array([(1, 2), (3, 4), (2, 3), (10, 11)], 
      dtype=[('par1', '<i4'), ('par2', '<i4')])
In [68]: Obj1['par1'][1:-1]
Out[68]: array([3, 2])
In [69]: Obj1[1:-1]['par1']
Out[69]: array([3, 2])
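Both forms of access on a structured array return views, which is what the question was asking for. A small sketch, reusing the dtype above (the lowercase variable names are only for illustration):

```python
import numpy as np

dt = np.dtype([('par1', int), ('par2', int)])
obj1 = np.array([(1, 2), (3, 4), (2, 3), (10, 11)], dtype=dt)

sub = obj1[1:-1]      # slice of whole records: a view
col = obj1['par1']    # single field: also a view

# Writing through the sliced view changes the original array
sub['par1'][0] = 99
```

After the assignment, both obj1 and col see the new value in the second record.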

Or as a recarray:

In [79]: Objrec=np.rec.fromrecords(Obj1,dtype=dt)
In [80]: Objrec.par1
Out[80]: array([ 1,  3,  2, 10])
In [81]: Objrec.par1[1:-1]
Out[81]: array([3, 2])
In [82]: Objrec[1:-1].par1
Out[82]: array([3, 2])
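As an alternative to np.rec.fromrecords, an existing structured array can also be reinterpreted as a recarray with .view(np.recarray), which shares the same memory instead of building a new array. A brief sketch, with illustrative variable names:

```python
import numpy as np

dt = np.dtype([('par1', int), ('par2', int)])
obj1 = np.array([(1, 2), (3, 4), (2, 3), (10, 11)], dtype=dt)

# Same data buffer, but fields are now also reachable as attributes
objrec = obj1.view(np.recarray)

first = objrec[1:-1].par2    # slice records, then pick a field
second = objrec.par2[1:-1]   # pick a field, then slice

objrec.par1[0] = 100         # writes through to obj1
```

Both access orders give the same values, and edits made through the recarray view show up in the original structured array.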
