Unexpected behaviour in numpy array indexing

Question

The shape of a numpy array changes in a somewhat unexpected manner when performing specific slicing.

I have tried several ways of slicing the same array, but slight differences lead to different resulting shapes:

import numpy as np
z = np.zeros((1,9,10,2))

# This makes sense
print(z[...,[1,0]].shape)
# (1, 9, 10, 2)
print(z[0,...].shape)
# (9, 10, 2)
print(z[0:1,...,[1,0]].shape)
# (1, 9, 10, 2)
print(z[0][...,[1,0]].shape)
# (9, 10, 2)

# This doesn't, I would expect (9, 10, 2) in both cases
print(z[0,:,:,[1,0]].shape)
# (2, 9, 10)
print(z[0,...,[1,0]].shape)
# (2, 9, 10)

In the last two examples I do not understand why the last axis is moved to the first position.

I am using Python 3.6.4 with numpy 1.15.1.

Recommended answer

The reason you might find the result in the last two cases unexpected is that the indexing follows the rules of advanced indexing, even though you're also indexing with slices.

For an extensive explanation of this behaviour, see the section on combining advanced and basic indexing in the numpy docs. There you'll see that one of the scenarios in which we might obtain unexpected results is when:

  • The advanced indexes are separated by a slice, Ellipsis or newaxis. For example x[arr1, :, arr2].

In your case, although you're only using an integer to index along the first axis, that integer is treated as an advanced index: it is broadcast against the index array and both are iterated as one. Because they are separated by slices, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions come after that.
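As a rough illustration of that rule, the sketch below reuses the shape of z from the question. The scalar 0 and the list [1, 0] broadcast to a common shape of (2,), and that dimension ends up first. The two-step indexing and the np.moveaxis call are just two ways one might recover the (9, 10, 2) layout; the variable names are made up for this example.

```python
import numpy as np

z = np.random.random((1, 9, 10, 2))

# The scalar 0 and [1, 0] are separated by slices, so the broadcast
# index dimension (length 2) is placed first in the result.
moved = z[0, :, :, [1, 0]]
print(moved.shape)

# Indexing in two steps leaves a single advanced index per operation,
# so its dimension stays in the last position.
two_step = z[0][..., [1, 0]]
print(two_step.shape)

# Same data in both cases; only the axis order differs.
restored = np.moveaxis(moved, 0, -1)
print(np.array_equal(restored, two_step))
```

If the (9, 10, 2) layout is what you need, either of the last two forms should work; which one reads better is a matter of taste.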

The key here is to understand that, as mentioned in the docs, it is like concatenating the indexing result for each advanced index element.

So in essence it is doing the same as:

z = np.random.random((1,9,10,2))
a = np.concatenate([z[0,:,:,[1]], z[0,:,:,[0]]], axis=0)
b = z[0,:,:,[1,0]]

np.allclose(a,b)
# True


But... why does this happen?

The reason is that advanced indexing and basic indexing behave differently because they serve different purposes. Let's go through an example to make this clear:

a = np.random.randint(1,10, (3,3,4))

print(a)
array([[[8, 2, 7, 2],
        [3, 1, 2, 4],
        [9, 8, 2, 2]],

       [[4, 2, 7, 4],
        [9, 6, 6, 7],
        [4, 2, 5, 1]],

       [[7, 4, 2, 3],
        [6, 9, 3, 6],
        [3, 3, 2, 6]]])

Now say we want to index a in order to obtain the last two columns from the first two 2d arrays. For this, the way to go is basic slicing:

a[:2,:,2:]

array([[[7, 2],
        [2, 4],
        [2, 2]],

       [[7, 4],
        [6, 7],
        [5, 1]]])

But what if, instead of selecting both of the last two columns from each array, I want column 2 from the first array and column 3 from the second? For that we have advanced indexing:

a[[0,1],:,[2,3]]

array([[7, 2, 2],
       [4, 7, 1]])

So as you can see, the two indexing methods are fundamentally different:

Integer array indexing allows selection of arbitrary items in the array based on their N-dimensional index. Each integer array represents a number of indexes into that dimension.

When several advanced indexes are used, they are always broadcast and iterated as one, and each element of the result is given by:

result[i_1, ..., i_M] == x[ind_1[i_1, ..., i_M], ind_2[i_1, ..., i_M], ..., ind_N[i_1, ..., i_M]]
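To make the formula concrete, here is a small check for the case of two one-dimensional index arrays separated by a slice. It uses an np.arange array rather than the random a above so the values are reproducible, and the names ind_1, ind_2 and expected are only illustrative.

```python
import numpy as np

a = np.arange(36).reshape(3, 3, 4)
ind_1 = np.array([0, 1])
ind_2 = np.array([2, 3])

# The index arrays are paired element-wise: (0, 2) and (1, 3).
result = a[ind_1, :, ind_2]

# Build the same result by hand, one row per pair (ind_1[i], ind_2[i]).
expected = np.stack([a[ind_1[i], :, ind_2[i]] for i in range(len(ind_1))])

print(result.shape)
print(np.array_equal(result, expected))
```

Each row of the result comes from one pair of indexes, which is why the paired dimension has length 2 rather than producing a 2 x 2 block of selections.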


So the resulting shapes will differ, even though the slices select the same number of elements along each axis as the index arrays do. The reason, as mentioned, is that the two indexing methods serve different purposes.

This, however, wouldn't happen if there were only a single advanced index:

a[:2,:,[2,3]]

array([[[7, 2],
        [2, 4],
        [2, 2]],

       [[7, 4],
        [6, 7],
        [5, 1]]])

Because there aren't any other advanced indices to broadcast with, the index array acts like a slice along the last axis.
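The three situations can be compared side by side. This is only a sketch on an np.arange array of the same shape as a, with made-up variable names; it shows where the dimension produced by the advanced indexes ends up in each case.

```python
import numpy as np

a = np.arange(36).reshape(3, 3, 4)

# One advanced index: its dimension stays where the index was,
# so the result matches the equivalent basic slice.
single = a[:2, :, [2, 3]]
print(single.shape)
print(np.array_equal(single, a[:2, :, 2:]))

# Two advanced indexes separated by a slice:
# the broadcast dimension is moved to the front.
separated = a[[0, 1], :, [2, 3]]
print(separated.shape)

# Two adjacent advanced indexes:
# the broadcast dimension replaces them in place.
adjacent = a[:, [0, 1], [2, 3]]
print(adjacent.shape)
```

Only the separated case moves an axis, which is the situation in the question, where the integer 0 and the list [1, 0] sit on either side of the slices.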
