Why does NumPy have shape (n,) instead of only (n, 1)?


Question

I have been curious about this for some time. I can live with it, but it bites me whenever I am not careful enough, so I decided to post it here. Consider the following example (NumPy version 1.8.2):

import numpy as np

a = np.array([[0, 1], [2, 3]])
print(np.shape(a[0:0, :]))    # (0, 2)
print(np.shape(a[0:1, :]))    # (1, 2)
print(np.shape(a[0:2, :]))    # (2, 2)
print(np.shape(a[0:100, :]))  # (2, 2)

print(np.shape(a[0]))         # (2,)
print(np.shape(a[0, :]))      # (2,)
print(np.shape(a[:, 0]))      # (2,)

I don't know how other people feel, but the result seems inconsistent to me. The last line is a column vector while the second-to-last line is a row vector, so they should have different dimensions; in linear algebra they do! (The a[0:100, :] slice returning (2, 2) is another surprise, but I will ignore it for now.) Consider a second example:

solution = scipy.sparse.linalg.spsolve(A, b)   # solution has shape (n,)
analytic = numpy.reshape(f(x, y), (n, 1))      # analytic has shape (n, 1)
error = solution - analytic

Now error has shape (n, n). Yes, in the second line I should use (n,) instead of (n, 1), but why? I used to use MATLAB a lot, where a one-dimensional vector has dimension (n, 1), linspace/arange return arrays of dimension (n, 1), and (n,) never exists. But in NumPy, (n, 1) and (n,) coexist, and there are many functions for dimension handling alone: atleast_1d/atleast_2d, newaxis, and various uses of reshape. To me those functions are more confusing than helpful. If an array prints like [1,2,3], then intuitively its dimension should be [1,3] instead of [3,], right? If NumPy did not have (n,), I can only see a gain in clarity, not a loss in functionality.
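The (n, n) result comes from broadcasting: subtracting a (n, 1) array from a (n,) array pairs every element of one with every element of the other. Here is a minimal sketch, with made-up values standing in for solution and analytic (the real ones would come from spsolve and f), that shows the effect and two ways to make the shapes agree:

```python
import numpy as np

n = 4
# Stand-ins for the arrays in the question; the values are illustrative only.
solution = np.arange(n, dtype=float)                       # shape (n,)
analytic = np.reshape(np.arange(n, dtype=float), (n, 1))   # shape (n, 1)

# (n,) against (n, 1) broadcasts to (n, n): every pairwise difference.
error_broadcast = solution - analytic

# Flatten the (n, 1) array so both operands are (n,).
error_flat = solution - analytic.ravel()

# Or give the (n,) array an explicit second axis so both are (n, 1).
error_column = solution[:, np.newaxis] - analytic

print(error_broadcast.shape)  # (4, 4)
print(error_flat.shape)       # (4,)
print(error_column.shape)     # (4, 1)
```

Either fix works; which one is preferable depends on whether the rest of the code expects 1-D or 2-D arrays.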

So there must be some design reason behind this. I have searched from time to time without finding a clear answer or explanation. Could someone help clarify this confusion or point me to some useful references? Your help is much appreciated.

Recommended answer

NumPy's philosophy is not that a[:, 0] is a "column vector" and a[0, :] a "row vector" in the general case. Rather, they are both, quite simply, vectors, i.e. arrays with one and only one dimension. This is actually highly logical and consistent (but yes, it can get annoying for those of us accustomed to Matlab).
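To make this concrete, here is a small sketch using the 2x2 array from the question. Indexing with an integer removes that axis, so both results are 1-D; indexing with a length-one slice keeps the axis, which is one way to get the row and column shapes a Matlab user expects:

```python
import numpy as np

a = np.array([[0, 1], [2, 3]])

row_1d = a[0, :]     # integer index drops axis 0 -> shape (2,)
col_1d = a[:, 0]     # integer index drops axis 1 -> shape (2,)
row_2d = a[0:1, :]   # slice keeps axis 0        -> shape (1, 2)
col_2d = a[:, 0:1]   # slice keeps axis 1        -> shape (2, 1)

print(row_1d.ndim, col_1d.ndim)    # 1 1
print(row_2d.shape, col_2d.shape)  # (1, 2) (2, 1)
```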

I say "in the general case" because that is true for NumPy's most general data structure, the array, which is intended for all kinds of multi-dimensional dense data storage and manipulation applications, not just matrix math. Having "rows" and "columns" is a highly specialized context for array operations, but yes, a very common one: that's why NumPy also supplies the matrix class. Convert your array to a numpy.matrix (or use the matrix constructor instead of array to begin with) and you will see behaviour closer to what you expect. For more information, see "What are the differences between numpy arrays and matrices? Which one should I use?"
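As a rough illustration of the matrix behaviour, the sketch below builds the same 2x2 data as a numpy.matrix. It assumes a NumPy version that still ships the matrix class; the current NumPy documentation no longer recommends it for new code, and some versions emit a PendingDeprecationWarning, which the example silences.

```python
import warnings
import numpy as np

with warnings.catch_warnings():
    # Some NumPy versions warn that matrix is pending deprecation.
    warnings.simplefilter("ignore")
    m = np.matrix([[0, 1], [2, 3]])
    row = m[0, :]       # stays 2-D: shape (1, 2)
    col = m[:, 0]       # stays 2-D: shape (2, 1)
    outer = col * row   # * is the matrix product here: shape (2, 2)
    inner = row * col   # matrix product: shape (1, 1)

print(row.shape, col.shape)      # (1, 2) (2, 1)
print(outer.shape, inner.shape)  # (2, 2) (1, 1)
```

With plain arrays, the slicing or newaxis approaches give the same shapes without depending on the matrix class.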

For cases where you're dealing with more than 2 dimensions, take a look at the numpy.expand_dims function. Though the syntax is annoyingly redundant and unpythonically verbose, when I'm working on arrays with more than 2 dimensions (so cannot use matrix), I'm forever having to use expand_dims to do this kind of thing:

A -= numpy.expand_dims(A.mean(axis=2), 2)   # subtract mean-across-layers from A

instead of

A -= A.mean(axis=2)   # naive attempt: the shapes no longer align, so this typically raises a ValueError
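Here is a minimal runnable sketch of the same idea on an illustrative (2, 3, 4) array (the shape and values are assumptions, not from the original post). It also shows two alternatives that are not in the answer above: the keepdims argument of mean, and indexing with newaxis.

```python
import numpy as np

# Illustrative 3-D array: 2 x 3 x 4, "layers" along axis 2.
A = np.arange(24, dtype=float).reshape(2, 3, 4)

layer_mean = A.mean(axis=2)   # shape (2, 3): axis 2 has been removed

# Naive subtraction: (2, 3, 4) and (2, 3) do not broadcast together.
try:
    A - layer_mean
    naive_failed = False
except ValueError:
    naive_failed = True

# Restore the reduced axis so the shapes are (2, 3, 4) and (2, 3, 1).
centered_expand = A - np.expand_dims(layer_mean, 2)
centered_keep = A - A.mean(axis=2, keepdims=True)
centered_newaxis = A - layer_mean[:, :, np.newaxis]

print(naive_failed)           # True
print(centered_expand.shape)  # (2, 3, 4)
```

Note that if the array's dimensions happened to line up (for example a cube-shaped array), the naive version would not raise and would subtract along the wrong axis, so restoring the axis explicitly is the safer habit.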

But consider Matlab, by contrast. Matlab implicitly asserts that there is no such thing as a one-dimensional object and that the minimum number of dimensions a thing can ever have is 2. Sure, you and I are both highly accustomed to this, but take a moment to realize how arbitrary it is. There is clearly a conceptual difference between a fundamentally one-dimensional object and a two-dimensional object that just happens to have extent 1 in one of its dimensions: the latter is allowed to grow in its second dimension, whereas the former doesn't even know what the second dimension means. And why should it? Hence a.shape==(N,) and a.shape==(N,1) make perfect sense as separate cases. You might as well ask "why is it not (N, 1, 1)?" or "why is it not (N, 1, 1, 1, 1, 1, 1)?"
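The difference between the two cases can be seen in a short sketch (the array values are arbitrary). A (N,) array has no second axis to transpose or grow along, while a (N, 1) array does:

```python
import numpy as np

v = np.array([1, 2, 3])   # shape (3,): one axis only
c = v.reshape(3, 1)       # shape (3, 1): two axes, the second of extent 1

print(v.ndim, c.ndim)     # 1 2
print(v.T.shape)          # (3,)   transposing a 1-D array changes nothing
print(c.T.shape)          # (1, 3)

# Stacking horizontally: the 2-D array grows along its second axis,
# while the 1-D array can only get longer along its single axis.
grown = np.hstack([c, c])
longer = np.hstack([v, v])
print(grown.shape)        # (3, 2)
print(longer.shape)       # (6,)
```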
