numpy: Why is there a difference between (x,1) and (x,) dimensionality

Question

I am wondering why numpy has one-dimensional arrays of shape (length, 1) and also one-dimensional arrays of shape (length,), with no second value.

I run into this quite frequently, e.g. when using np.concatenate(), which then requires a reshape step beforehand (or I could use hstack/vstack directly).

I can't think of a reason why this behavior is desirable. Can someone explain?

Edit:
One of the comments suggested that my question is a possible duplicate. I am more interested in the underlying working logic of numpy than in the fact that there is a distinction between 1d and 2d arrays, which I think is the point of the mentioned thread.

Recommended answer

The data of an ndarray is stored as a 1d buffer, just a block of memory. The multidimensional nature of the array is produced by the shape and strides attributes, and the code that uses them.
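As a minimal sketch of that idea (the array sizes and variable names here are only illustrative), reshaping a contiguous array gives new shape and strides over the same block of memory rather than a copy:

```python
import numpy as np

x = np.arange(12)        # one flat buffer of 12 elements
a = x.reshape(3, 4)      # same buffer, viewed as 3 rows of 4
b = x.reshape(12, 1)     # same buffer, viewed as 12 rows of 1

# No data is copied: all three arrays share the same memory.
shares = np.shares_memory(x, a) and np.shares_memory(x, b)

# Writing through one view is visible through the others.
a[0, 0] = 99
first_values = (int(x[0]), int(b[0, 0]))

print(x.shape, a.shape, b.shape)
print(shares, first_values)
```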

The numpy developers chose to allow for an arbitrary number of dimensions, so the shape and strides are represented as tuples of any length, including 0 and 1.
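A small illustration of shape tuples of different lengths (the example arrays are arbitrary); in each case the length of the shape tuple equals ndim, and strides has the same length:

```python
import numpy as np

scalar = np.array(5)                    # 0d: shape is the empty tuple
vector = np.arange(3)                   # 1d: shape (3,)
column = np.arange(3).reshape(3, 1)     # 2d: shape (3, 1)
cube = np.zeros((2, 3, 4))              # 3d: shape (2, 3, 4)

arrays = (scalar, vector, column, cube)
shapes = [arr.shape for arr in arrays]
ndims = [arr.ndim for arr in arrays]
stride_lengths = [len(arr.strides) for arr in arrays]

print(shapes)
print(ndims, stride_lengths)
```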

In contrast, MATLAB was built around FORTRAN programs that were developed for matrix operations. In the early days everything in MATLAB was a 2d matrix. Later (MATLAB 5, in the late 1990s) it was generalized to allow more than 2d, but never less. The numpy np.matrix class still follows that old 2d MATLAB constraint.

If you come from the MATLAB world, you are used to these 2 dimensions, and to the distinction between a row vector and a column vector. But in math and physics that isn't influenced by MATLAB, a vector is a 1d array. Python lists are inherently 1d, as are C arrays. To get 2d you have to have lists of lists or arrays of pointers to arrays, with the x[1][2] style of indexing.
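To make the contrast concrete, here is a short sketch (the values are made up) comparing a nested list with the equivalent ndarray, and showing that a plain list becomes a 1d array:

```python
import numpy as np

nested = [[0, 1, 2], [3, 4, 5]]    # a list of lists: 2d only by nesting
value_from_list = nested[1][2]     # two separate indexing steps

arr = np.array(nested)             # one ndarray with shape (2, 3)
value_from_array = arr[1, 2]       # a single multi-axis index

flat = np.array([0, 1, 2])         # a plain list becomes a 1d array

print(value_from_list, value_from_array)
print(arr.shape, flat.shape)
```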

Look at the shape and strides of this array and its variants:

In [48]: x=np.arange(10)

In [49]: x.shape
Out[49]: (10,)

In [50]: x.strides
Out[50]: (4,)

In [51]: x1=x.reshape(10,1)

In [52]: x1.shape
Out[52]: (10, 1)

In [53]: x1.strides
Out[53]: (4, 4)

In [54]: x2=np.concatenate((x1,x1),axis=1)

In [55]: x2.shape
Out[55]: (10, 2)

In [56]: x2.strides
Out[56]: (8, 4)
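The byte counts in this session correspond to 4-byte integers; on platforms where the default integer is 8 bytes, the same commands report strides twice as large. A sketch that pins the dtype (an assumption made here so the numbers are reproducible) and expresses strides in elements rather than bytes:

```python
import numpy as np

x = np.arange(10, dtype=np.int32)        # fix the dtype so itemsize is 4 bytes
x1 = x.reshape(10, 1)
x2 = np.concatenate((x1, x1), axis=1)    # a new (10, 2) array in C order

size = x.itemsize

def in_items(arr):
    """Express strides as element counts instead of bytes."""
    return tuple(s // size for s in arr.strides)

print(x.strides, in_items(x))      # step of one element along the only axis
print(x2.strides, in_items(x2))    # one row is two elements wide
print(x1.strides)                  # the stride of a length-1 axis is never used
```

Stepping along the first axis of x2 skips two elements (one full row), while stepping along the second axis skips one.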

MATLAB adds new dimensions at the end. It orders its values like an order='F' array, and can readily change an (n,1) matrix to an (n,1,1,1) one. numpy defaults to order='C', and readily expands an array's dimensions at the start. Understanding this is essential when taking advantage of broadcasting.
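A brief sketch of both points, using small made-up arrays: numpy pads missing dimensions on the left when broadcasting, a trailing dimension has to be added explicitly, and order='F' fills values column by column as MATLAB does:

```python
import numpy as np

x = np.arange(6)

# Broadcasting pads missing dimensions at the start: (6,) acts like (1, 1, 6).
leading = np.broadcast_to(x, (2, 3, 6))

# A trailing dimension is not added automatically; it must be requested.
trailing = x[:, np.newaxis]                        # shape (6, 1)

c_order = np.arange(6).reshape(2, 3)               # default order='C', row by row
f_order = np.arange(6).reshape(2, 3, order='F')    # column by column

print(leading.shape, trailing.shape)
print(c_order.tolist())
print(f_order.tolist())
```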

Thus x1 + x is (10,1)+(10,) => (10,1)+(1,10) => (10,10).

Because of broadcasting, an (n,) array is more like a (1,n) one than an (n,1) one. A 1d array is more like a row matrix than a column one.

In [64]: np.matrix(x)
Out[64]: matrix([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])

In [65]: _.shape
Out[65]: (1, 10)
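The broadcasting step described above can be checked directly. This sketch rebuilds x and x1 and compares the implicit result with one where the (1, 10) shape is spelled out:

```python
import numpy as np

x = np.arange(10)                # shape (10,)
x1 = x.reshape(10, 1)            # shape (10, 1)

outer_sum = x1 + x               # (10, 1) + (10,) broadcasts to (10, 10)
row_like = x[np.newaxis, :]      # the explicit (1, 10) version of x
same = np.array_equal(outer_sum, x1 + row_like)

# np.broadcast reports the combined shape without doing the arithmetic.
combined_shape = np.broadcast(x1, x).shape

print(outer_sum.shape, combined_shape, same)
```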

The point with concatenate is that it requires matching dimensions. It does not use broadcasting to adjust dimensions. There are a bunch of stack functions that ease this constraint, but they do so by adjusting the dimensions before using concatenate. Look at their code (readable Python).
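A sketch of that constraint, reusing the x and x1 arrays from earlier: concatenate refuses to mix a (10, 1) array with a (10,) one, while adding the missing axis by hand, or letting np.column_stack do it, works:

```python
import numpy as np

x = np.arange(10)            # shape (10,)
x1 = x.reshape(10, 1)        # shape (10, 1)

# Mixing 2d and 1d inputs fails: concatenate does not broadcast.
try:
    np.concatenate((x1, x), axis=1)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# Adjusting the dimensions by hand first makes it work.
fixed = np.concatenate((x1, x[:, np.newaxis]), axis=1)

# column_stack turns 1d inputs into columns before joining.
stacked = np.column_stack((x1, x))

print(type(mismatch_error).__name__)
print(fixed.shape, stacked.shape)
```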

So a proficient numpy user needs to be comfortable with that generalized shape tuple, including the empty () (0d array), (n,) 1d, and up. For more advanced work, understanding strides helps as well (look, for example, at the strides and shape of a transpose).
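For the transpose case, a minimal sketch (the dtype is fixed to int64 here only so that the byte counts are predictable) shows that transposing reverses shape and strides without touching the data:

```python
import numpy as np

a = np.arange(12, dtype=np.int64).reshape(3, 4)   # 8-byte items, C order
t = a.T                                           # a view, not a copy

print(a.shape, a.strides)    # rows are 4 items (32 bytes) apart
print(t.shape, t.strides)    # same numbers, reversed
print(np.shares_memory(a, t))
```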
