Difference between numpy.array shape (R, 1) and (R,)

Question

In numpy, some operations return arrays of shape (R, 1) while others return shape (R,). This makes matrix multiplication more tedious, since an explicit reshape is required. For example, given a matrix M, suppose we want to compute numpy.dot(M[:,0], numpy.ones((1, R))), where R is the number of rows (of course, the same issue also occurs column-wise). We get a "matrices are not aligned" error, since M[:,0] has shape (R,) but numpy.ones((1, R)) has shape (1, R).

So my questions are:

  1. What's the difference between shape (R, 1) and (R,)? I know literally it's list of numbers and list of lists where all list contains only a number. I'm just wondering why numpy isn't designed to favor shape (R, 1) over (R,), for easier matrix multiplication.

  2. Are there better ways to write the above example, without explicitly reshaping like this: numpy.dot(M[:,0].reshape(R, 1), numpy.ones((1, R)))?

Recommended answer

1. The meaning of shapes in NumPy

You write, "I know literally it's list of numbers and list of lists where all list contains only a number" but that's a bit of an unhelpful way to think about it.

The best way to think about NumPy arrays is that they consist of two parts, a data buffer which is just a block of raw elements, and a view which describes how to interpret the data buffer.

For example, if we create an array of 12 integers:

>>> a = numpy.arange(12)
>>> a
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

Then a consists of a data buffer, arranged something like this:

┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

and a view which describes how to interpret the data:

>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a.dtype
dtype('int64')
>>> a.itemsize
8
>>> a.strides
(8,)
>>> a.shape
(12,)

Here the shape (12,) means the array is indexed by a single index which runs from 0 to 11. Conceptually, if we label this single index i, the array a looks like this:

i= 0    1    2    3    4    5    6    7    8    9   10   11
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

If we reshape an array, this doesn't change the data buffer. Instead, it creates a new view that describes a different way to interpret the data. So after:

>>> b = a.reshape((3, 4))

the array b has the same data buffer as a, but now it is indexed by two indices which run from 0 to 2 and 0 to 3 respectively. If we label the two indices i and j, the array b looks like this:

i= 0    0    0    0    1    1    1    1    2    2    2    2
j= 0    1    2    3    0    1    2    3    0    1    2    3
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

which means that:

>>> b[2,1]
9
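
As a rough illustration of how the view maps an index pair onto the buffer, the sketch below recomputes b[2,1] by hand from b.strides. The helper name element_at is made up for this example, and it only works here because b starts at the beginning of a's buffer.

```python
import numpy as np

a = np.arange(12)
b = a.reshape((3, 4))

def element_at(arr, base, index):
    """Look up arr[index] by hand: byte offset = sum(index_k * stride_k).

    Hypothetical helper for illustration. `base` is the flat 1-D array that
    owns the buffer; arr must be a view starting at the start of that buffer.
    """
    byte_offset = sum(i * s for i, s in zip(index, arr.strides))
    return base[byte_offset // arr.itemsize]

# b reuses a's buffer rather than copying it
shares = np.shares_memory(a, b)
by_hand = element_at(b, a, (2, 1))
print(shares, by_hand, b[2, 1])
```

Both lookups land on the same slot of the buffer, because index (2, 1) is 2 rows of 4 elements plus 1 element from the start.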

You can see that the second index changes quickly and the first index changes slowly. If you prefer this to be the other way round, you can specify the order parameter:

>>> c = a.reshape((3, 4), order='F')

which results in an array indexed like this:

i= 0    1    2    0    1    2    0    1    2    0    1    2
j= 0    0    0    1    1    1    2    2    2    3    3    3
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

which means that:

>>> c[2,1]
5
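
One way to see the two orderings side by side is to compare the strides of the C-ordered and Fortran-ordered views. This is only a sketch: it rebuilds the arrays a, b and c from above and converts the strides from bytes to element counts so that the result does not depend on the integer size of the platform.

```python
import numpy as np

a = np.arange(12)
b = a.reshape((3, 4))             # C order: last index varies fastest
c = a.reshape((3, 4), order='F')  # Fortran order: first index varies fastest

# Express strides in units of elements rather than bytes
b_steps = tuple(s // a.itemsize for s in b.strides)
c_steps = tuple(s // a.itemsize for s in c.strides)
print(b_steps)  # (4, 1)
print(c_steps)  # (1, 3)

# Same buffer, same shape, different element at the same index pair
print(b[2, 1], c[2, 1])
```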

It should now be clear what it means for an array to have a shape with one or more dimensions of size 1. After:

>>> d = a.reshape((12, 1))

the array d is indexed by two indices, the first of which runs from 0 to 11, and the second index is always 0:

i= 0    1    2    3    4    5    6    7    8    9   10   11
j= 0    0    0    0    0    0    0    0    0    0    0    0
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

and so:

>>> d[10,0]
10
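
To tie this back to the question, here is a small sketch of how the one-index and two-index views behave in numpy.dot. The array named row is made up for the example and plays the role of numpy.ones((1, R)) with R = 12.

```python
import numpy as np

a = np.arange(12)           # shape (12,): one index
d = a.reshape((12, 1))      # shape (12, 1): two indices, second always 0

print(a.ndim, d.ndim)       # 1 2
print(a[10], d[10, 0])      # both refer to the same stored element

row = np.ones((1, 12))

# (12, 1) times (1, 12) is an ordinary matrix product giving (12, 12)
outer = np.dot(d, row)
print(outer.shape)

# (12,) times (1, 12) fails: the single axis of length 12 is contracted
# against the first axis of `row`, which has length 1
try:
    np.dot(a, row)
    failed = False
except ValueError:
    failed = True
print(failed)
```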

A dimension of length 1 is "free" (in some sense), so there's nothing stopping you from going to town:

>>> e = a.reshape((1, 2, 1, 6, 1))

giving an array indexed like this:

i= 0    0    0    0    0    0    0    0    0    0    0    0
j= 0    0    0    0    0    0    1    1    1    1    1    1
k= 0    0    0    0    0    0    0    0    0    0    0    0
l= 0    1    2    3    4    5    0    1    2    3    4    5
m= 0    0    0    0    0    0    0    0    0    0    0    0
┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
│  0 │  1 │  2 │  3 │  4 │  5 │  6 │  7 │  8 │  9 │ 10 │ 11 │
└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

and so:

>>> e[0,1,0,0,0]
6
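
Length-1 axes can be removed as freely as they are added. The following sketch, which goes slightly beyond the answer's own examples, uses squeeze and numpy.newaxis on the array e from above; the names f and g are introduced only for illustration.

```python
import numpy as np

a = np.arange(12)
e = a.reshape((1, 2, 1, 6, 1))

# Dropping the length-1 axes leaves the (2, 6) array underneath
f = e.squeeze()
print(f.shape)          # (2, 6)
print(f[1, 0])          # same element as e[0, 1, 0, 0, 0]

# Indexing with np.newaxis (an alias for None) puts a length-1 axis back
g = f[:, np.newaxis, :]
print(g.shape)          # (2, 1, 6)

# All of these are views onto a's buffer
views = (np.shares_memory(a, e), np.shares_memory(a, f), np.shares_memory(a, g))
print(views)
```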

See the NumPy internals documentation for more details about how arrays are implemented.

Since numpy.reshape just creates a new view whenever the existing buffer allows it (it falls back to copying the data only when the requested shape cannot be described by strides over that buffer), you shouldn't be scared about using it whenever necessary. It's the right tool to use when you want to index an array in a different way.
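
If in doubt about whether a particular reshape produced a view, numpy.shares_memory can check it. The sketch below shows the common view case and, as a caveat, one case where reshape has to copy because the requested layout cannot be expressed as strides over the original buffer.

```python
import numpy as np

a = np.arange(12)
b = a.reshape((3, 4))

# Contiguous input: the reshape is a view, so writes show through
is_view = np.shares_memory(a, b)
b[0, 0] = 100
print(is_view, a[0])

# The transpose is itself a view, but flattening it in C order cannot be
# described by strides over the original buffer, so reshape copies
t = b.T
flat = t.reshape(-1)
is_copy = not np.shares_memory(b, flat)
print(is_copy)
```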

However, in a long computation it's usually possible to arrange to construct arrays with the "right" shape in the first place, and so minimize the number of reshapes and transposes. But without seeing the actual context that led to the need for a reshape, it's hard to say what should be changed.
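
As one possible way of getting the "right" shape in the first place, the column can be selected so that it stays two-dimensional, which avoids the later reshape. This is a sketch under assumptions: the matrix M and the size R = 4 are made up, and whether a slice, numpy.newaxis, or numpy.outer fits best depends on the surrounding code.

```python
import numpy as np

R = 4
M = np.arange(R * 3).reshape((R, 3))   # made-up example matrix

col_1d = M[:, 0]               # integer index drops the axis: shape (R,)
col_2d = M[:, 0:1]             # length-1 slice keeps it: shape (R, 1)
col_new = M[:, 0, np.newaxis]  # explicit new axis: shape (R, 1)

print(col_1d.shape, col_2d.shape, col_new.shape)

ones_row = np.ones((1, R))
product = np.dot(col_2d, ones_row)      # (R, 1) times (1, R) -> (R, R)
same = np.outer(col_1d, np.ones(R))     # outer product of two 1-D arrays
print(product.shape, np.array_equal(product, same))
```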

The example in your question is:

numpy.dot(M[:,0], numpy.ones((1, R)))

but this is not realistic. First, this expression:

M[:,0].sum()

computes the result more simply. Second, is there really something special about column 0? Perhaps what you actually need is:

M.sum(axis=0)
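
To make the comparison concrete, here is a sketch that computes the column sums both ways, reading the question's intent as "multiply by a vector of ones to add things up". The matrix M and R = 4 are made-up example data, and keepdims is shown only as an option for when a two-dimensional result is wanted.

```python
import numpy as np

R = 4
M = np.arange(R * 3, dtype=float).reshape((R, 3))  # made-up example data

# Summing column 0: dot product with a vector of ones vs. the direct call
via_dot = np.dot(M[:, 0], np.ones(R))
via_sum = M[:, 0].sum()
print(via_dot, via_sum)

# Summing every column at once
all_via_dot = np.dot(np.ones((1, R)), M)     # shape (1, 3)
all_via_sum = M.sum(axis=0)                  # shape (3,)
all_keepdims = M.sum(axis=0, keepdims=True)  # shape (1, 3), if 2-D is wanted
print(all_via_dot.shape, all_via_sum.shape, all_keepdims.shape)
```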
