NumPy: use reshape or newaxis to add dimensions

Question

Either ndarray.reshape or numpy.newaxis can be used to add a new dimension to an array. Both seem to create a view. Is there any reason or advantage to use one instead of the other?

>>> import numpy as np
>>> b = np.ones(4)
>>> b
array([ 1.,  1.,  1.,  1.])
>>> c = b.reshape((1,4))
>>> c *= 2
>>> c
array([[ 2.,  2.,  2.,  2.]])
>>> c.shape
(1, 4)
>>> b
array([ 2.,  2.,  2.,  2.])
>>> d = b[np.newaxis,...]
>>> d
array([[ 2.,  2.,  2.,  2.]])
>>> d.shape
(1, 4)
>>> d *= 2
>>> b
array([ 4.,  4.,  4.,  4.])
>>> c
array([[ 4.,  4.,  4.,  4.]])
>>> d
array([[ 4.,  4.,  4.,  4.]])

Recommended answer

I don't see evidence of much difference. You could run a timing test on very large arrays. Basically both adjust the shape, and possibly the strides. __array_interface__ is a nice way of inspecting this information. For example:

In [94]: b.__array_interface__
Out[94]: 
{'data': (162400368, False),
 'descr': [('', '<f8')],
 'shape': (5,),
 'strides': None,
 'typestr': '<f8',
 'version': 3}

In [95]: b[None,:].__array_interface__
Out[95]: 
{'data': (162400368, False),
 'descr': [('', '<f8')],
 'shape': (1, 5),
 'strides': (0, 8),
 'typestr': '<f8',
 'version': 3}

In [96]: b.reshape(1,5).__array_interface__
Out[96]: 
{'data': (162400368, False),
 'descr': [('', '<f8')],
 'shape': (1, 5),
 'strides': None,
 'typestr': '<f8',
 'version': 3}

Both create a view that uses the same data buffer as the original. The shape is the same, but reshape doesn't change the strides. reshape also lets you specify the order.
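As a rough check of the points above, the sketch below uses np.shares_memory to confirm that both results are views on b's buffer, and shows what the order argument of reshape does. The array a and the variable names are illustrative and not from the original post.

```python
import numpy as np

b = np.ones(5)

via_newaxis = b[np.newaxis, :]   # same as b[None, :]
via_reshape = b.reshape(1, 5)

# Both results are views: they share b's data buffer.
print(np.shares_memory(b, via_newaxis))      # True
print(np.shares_memory(b, via_reshape))      # True
print(via_newaxis.shape, via_reshape.shape)  # (1, 5) (1, 5)

# reshape can also choose how elements are laid out in the new shape;
# indexing with newaxis has no equivalent option.
a = np.arange(6)
c_order = a.reshape(2, 3)             # default order='C': rows filled first
f_order = a.reshape(2, 3, order='F')  # order='F': columns filled first
print(c_order.tolist())  # [[0, 1, 2], [3, 4, 5]]
print(f_order.tolist())  # [[0, 2, 4], [1, 3, 5]]
```

When only a length-1 axis is being added, the order argument makes no practical difference, so this option matters mainly for more general reshapes.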

And .flags shows a difference in the C_CONTIGUOUS flag.
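One way to look at this yourself is to print the strides and a couple of flags for both results. This is only a sketch; the reported values may depend on the NumPy version in use, so no expected output is shown.

```python
import numpy as np

b = np.ones(5)

# Compare the two ways of adding a leading axis.
for label, arr in [("newaxis", b[None, :]), ("reshape", b.reshape(1, 5))]:
    print(label, "strides:", arr.strides)
    print(label, "C_CONTIGUOUS:", arr.flags['C_CONTIGUOUS'])
    print(label, "OWNDATA:", arr.flags['OWNDATA'])
```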

reshape may be faster because it makes fewer changes. Either way, the operation shouldn't noticeably affect the time of larger calculations.

For example, with a large b:

In [123]: timeit np.outer(b.reshape(1,-1),b)
1 loops, best of 3: 288 ms per loop
In [124]: timeit np.outer(b[None,:],b)
1 loops, best of 3: 287 ms per loop
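Outside IPython, a comparable measurement can be sketched with the standard timeit module. The array size below is an assumption (the post does not state how large b was), and absolute timings will differ by machine.

```python
import timeit
import numpy as np

b = np.ones(1000)  # assumed size; the original post does not give one

# Time the same outer product with each way of adding the axis.
t_reshape = timeit.timeit(lambda: np.outer(b.reshape(1, -1), b), number=3)
t_newaxis = timeit.timeit(lambda: np.outer(b[None, :], b), number=3)
print(f"reshape: {t_reshape:.4f} s, newaxis: {t_newaxis:.4f} s")

# Both expressions compute the same result.
same = np.array_equal(np.outer(b.reshape(1, -1), b), np.outer(b[None, :], b))
print(same)  # True
```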


Interesting observation: b.reshape(1,4).strides -> (32, 8)

Here's my guess: .__array_interface__ displays an underlying attribute, while .strides is more like a property (though it may all be buried in C code). The default underlying value is None, and when it is needed for a calculation (or for display via .strides) it is computed from the shape and item size. 32 is the distance in bytes to the end of the first row (4 x 8). np.ones((2,4)).strides gives the same (32, 8) (and None in __array_interface__).
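The calculation described here can be sketched in a few lines. The helper c_contiguous_strides is hypothetical, written only to illustrate the arithmetic; it is not NumPy's internal code.

```python
import numpy as np

def c_contiguous_strides(shape, itemsize):
    """Byte strides of a C-ordered array (hypothetical helper, for illustration)."""
    strides = []
    step = itemsize
    # The last axis moves one item at a time; each earlier axis
    # skips over everything contained in the axes after it.
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))

print(c_contiguous_strides((2, 4), 8))  # (32, 8)
print(np.ones((2, 4)).strides)          # (32, 8)
print(c_contiguous_strides((1, 4), 8))  # (32, 8)
```

With a shape of (1, 4) and 8-byte items the same rule gives (32, 8), which matches the value reported for b.reshape(1,4).strides.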

b[None,:], on the other hand, prepares the array for broadcasting. When an array is broadcast, its existing values are reused rather than copied. That is what the 0 in (0,8) does.

In [147]: b1=np.broadcast_arrays(b,np.zeros((2,1)))[0]

In [148]: b1.shape
Out[148]: (2, 5)

In [149]: b1.strides
Out[149]: (0, 8)

In [150]: b1.__array_interface__
Out[150]: 
{'data': (3023336880L, False),
 'descr': [('', '<f8')],
 'shape': (2, 5),
 'strides': (0, 8),
 'typestr': '<f8',
 'version': 3}

b1 displays the same as np.ones((2,5)) but holds only 5 items in memory.
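The same effect can be seen in a short sketch using np.broadcast_to, swapped in here for np.broadcast_arrays because it needs only one input. The zero stride on the first axis means both rows read the same five values.

```python
import numpy as np

b = np.ones(5)
b1 = np.broadcast_to(b, (2, 5))  # read-only broadcast view of b

print(b1.shape)                 # (2, 5)
print(b1.strides)               # (0, 8)
print(np.shares_memory(b, b1))  # True

# Changing b is visible in both rows of b1, since no data was copied.
b[0] = 7.0
print(b1[:, 0].tolist())        # [7.0, 7.0]
```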

np.broadcast_arrays is a function in /numpy/lib/stride_tricks.py. It uses as_strided from the same file. These functions manipulate the shape and strides attributes directly.
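To illustrate what manipulating shape and strides directly looks like, here is a minimal sketch that builds a broadcast-style view by hand with as_strided. The names b and b2 are illustrative. as_strided performs no bounds checking, so the view is created read-only here as a precaution.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

b = np.arange(5, dtype=np.float64)

# Build a (2, 5) view by hand: stride 0 along axis 0 repeats the same row.
b2 = as_strided(b, shape=(2, 5), strides=(0, b.itemsize), writeable=False)

print(b2.strides)                                      # (0, 8)
print(np.array_equal(b2, np.broadcast_to(b, (2, 5))))  # True
```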
