应用ufunc后在容器内维护numpy子类 [英] Maintaining numpy subclass inside a container after applying ufunc

查看:78
本文介绍了应用ufunc后在容器内维护numpy子类的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我按照创建了一个从numpy的ndarray派生的类. numpy的文档,它看起来像(减少属性的数量以使其更具可读性):

I have created a class derived from numpy's ndarray following numpy's documentation, and it looks like (reduce the number of attributes to make it more readable):

import numpy as np

class Atom3D( np.ndarray ):
    __array_priority__ = 11.0

    def __new__( cls, idnum, coordinates):

        # Cast numpy to be our class type
        assert len(coordinates) == 3
        obj = np.asarray(coordinates, dtype= np.float64).view(cls)
        # add the new attribute to the created instance
        obj._number = int(idnum)
        # Finally, we must return the newly created object:
        return obj

    def __array_finalize__( self, obj ):
        self._number = getattr(obj, '_number', None)

    def __array_wrap__( self, out_arr, context=None ):
        return np.ndarray.__array_wrap__(self, out_arr, context)

    def __repr__( self ):
        return "{0._number}: ({0[0]:8.3f}, {0[1]:8.3f}, {0[2]:9.3f})".format(self)

执行测试时,我将numpy的ufunc应用于对象:

When I execute a test in which I apply a numpy's ufunc to the object:

a1 = Atom3D(1, [5., 5., 5.])
print type(a1), repr(a1)
m  = np.identity(3)
a2 = np.dot(a1, m)
print type(a2), repr(a2)

我获得了预期的结果;也就是说,点函数保留了对象的子类:

I obtain the expected result; that is, the dot function keeps the subclassing of the object:

<class '__main__.Atom3D'> 1: (   5.000,    5.000,     5.000)  
<class '__main__.Atom3D'> 1: (   5.000,    5.000,     5.000)

但是,当我尝试将相同的np.dot应用于这些对象的数组时,子类将丢失.因此,执行:

But, when I try to apply the same np.dot to an array of these objects, the subclass is lost. Thus, executing:

print "regular"
atom_list1 = [a1, a2, a3]
atom_list2 = np.dot(atom_list1, m)
for _ in atom_list2:
    print type(_), repr(_)

print "numpy array"
atom_list1 = np.array([a1, a2, a3], dtype=np.object)
atom_list2 = np.dot(atom_list1, m)
for _ in atom_list2:
    print type(_), repr(_)

给我这个:

regular
<type 'numpy.ndarray'> array([ 5.,  5.,  5.])
<type 'numpy.ndarray'> array([ 6.,  4.,  2.])
<type 'numpy.ndarray'> array([ 8.,  6.,  8.])
numpy array
<type 'numpy.ndarray'> array([5.0, 5.0, 5.0], dtype=object)
<type 'numpy.ndarray'> array([6.0, 4.0, 2.0], dtype=object)
<type 'numpy.ndarray'> array([8.0, 6.0, 8.0], dtype=object)

其他操作(如__sub__)也将如此:

The same will go for other operation such as __sub__:

print "regular"
a1 = Atom3D(1, [5., 5., 5.])
a2 = a1 - np.array([3., 2., 0.])
print type(a2), repr(a2)
print "numpy array"
a1 = Atom3D(1, [5., 5., 5.])
a2 = Atom3D(2, [6., 4., 2.])
a3 = Atom3D(3, [8., 6., 8.])
atom_list1 = np.array([a1, a2, a3], dtype=np.object)
atom_list2 = atom_list1 - np.array([3., 2., 0.])
for _ in atom_list2:
    print type(_), repr(_)

将产生产量:

regular
<class '__main__.Atom3D'> 1: (   2.000,    3.000,     5.000)
numpy array
<type 'numpy.ndarray'> array([2.0, 3.0, 5.0], dtype=object)
<type 'numpy.ndarray'> array([3.0, 2.0, 2.0], dtype=object)
<type 'numpy.ndarray'> array([5.0, 4.0, 8.0], dtype=object)

我一直在寻找,但没有发现自己在怎么做.
谢谢!

I've been looking but have not found how I'm I going wrong.
Thanks!!

J .-

推荐答案

没有dtype=Atom3D这样的东西. dtype=listdtype=np.ndarray也是如此.它创建一个dtype=object数组,其中每个元素都是指向内存中其他位置的对象的指针.

There's no such thing as dtype=Atom3D. The same goes for dtype=list and dtype=np.ndarray. It creates a dtype=object array, where each element is a pointer to objects elsewhere in memory.

使用np.array(...)创建对象数组可能很棘手. np.array评估条目并做出一些自己的选择.最好的选择是,如果要对进入对象数组的元素进行绝对控制,那就是创建一个空白"对象,然后自己分配元素.

Creating an object array with np.array(...) can be tricky. np.array evaluates the entries and makes some of its own choices. The best bet, if you want absolute control over the elements that go into an object array is to create an 'blank' one, and assign elements yourself.

In [508]: A=np.array([np.matrix([1,2]),np.matrix([2,1])],dtype=object)

In [509]: A       # a 3d array, no matrix subarrays
Out[509]: 
array([[[1, 2]],

       [[2, 1]]], dtype=object)

In [510]: A=np.empty((2,),dtype=object)

In [511]: A
Out[511]: array([None, None], dtype=object)

In [512]: A[:]=[np.matrix([1,2]),np.matrix([2,1])]

In [513]: A
Out[513]: array([matrix([[1, 2]]), matrix([[2, 1]])], dtype=object)

除非您确实需要对象数组来进行诸如重塑和转置之类的操作,否则通常最好使用列表.

Unless you really need the object array for things like reshaping and transposing, you are usually better off working with a list.

混合对象类型也可以:

In [522]: A=np.asarray([np.matrix([1,2]),np.ma.masked_array([2,1])],dtype=np.object)

In [523]: A
Out[523]: 
array([matrix([[1, 2]]),
       masked_array(data = [2 1],
             mask = False,
       fill_value = 999999)
], dtype=object)

=========================

==========================

当您执行np.dot([a1,a2,a3],m)时,它将首先使用np.asarray([a1,a2,a3])将任何列表转换为数组.结果是2d数组,而不是Atom3d对象的数组.因此,dot是通常的数组点.

When you do np.dot([a1,a2,a3],m), it first turns any lists into an array with np.asarray([a1,a2,a3]). The result is a 2d array, not an array of Atom3d objects. So the dot is the usual array dot.

如果我按照建议创建对象数组:

If I create an object array as I suggested:

In [14]: A=np.empty((3,),dtype=object)
In [16]: A[:]=[a1,a2,a1+a2]

In [17]: A
Out[17]: 
array([1: (   5.000,    5.000,     5.000),
       1: (   5.000,    5.000,     5.000),
       1: (  10.000,   10.000,    10.000)], dtype=object)

In [18]: np.dot(A,m)
Out[18]: 
array([1: (   5.000,    5.000,     5.000),
       1: (   5.000,    5.000,     5.000),
       1: (  10.000,   10.000,    10.000)], dtype=object)

Atom3D类型被保留;

相同的减法:

In [23]: A- np.array([3.,2., 0])
Out[23]: 
array([1: (   2.000,    2.000,     2.000),
       1: (   3.000,    3.000,     3.000),
       1: (  10.000,   10.000,    10.000)], dtype=object)

该数组和Atom3D的添加都可以使用,尽管显示结果有问题:

Addition of that array and an Atom3D works, though there is something wrong with displaying the result:

In [39]: x = A + a2

In [40]: x
Out[40]: <repr(<__main__.Atom3D at 0xb5062294>) failed: TypeError: non-empty format string passed to object.__format__>

使用对象dtype数组进行的计算比较困难.有些工作显然是通过遍历数组的元素,应用函数,然后将结果返回到对象数组来完成的.实际上是

Calculations with object dtype arrays are iffy. Some work, apparently by iterating over the elements of the array, applying the function, and turning the result back into an object array. In effect an array version of

 [func(a, x) for a in A]

即使可以工作,它也不会执行快速的编译操作;它是迭代的(时间将与列表中的类似).

Even if it works, it isn't performing a fast compiled operation; it is iterative (timings will be similar to the list equivalent).

其他事情不起作用

In [41]: a1>0
Out[41]: 1: (   1.000,    1.000,     1.000)

In [42]: A>0
...
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

我们已经多次指出对象dtype数组只不过是荣耀列表.元素与列表一样是指针,因此操作将涉及迭代这些指针-在Python中,而不是C中.这不是numpy代码的高度开发的角落.

We've pointed out many times that object dtype arrays are little more than glorified lists. The elements are pointers as with lists, and thus operations will involve iterating over those pointers - in Python, not C. This is not a highly developed corner of numpy code.

这篇关于应用ufunc后在容器内维护numpy子类的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆