Why does numpy mixed basic / advanced indexing depend on slice adjacency?


Question

I know similar questions have been asked before (e.g.), but AFAIK nobody has answered my specific question...

My question is about this passage:

... Two cases of index combination need to be distinguished:

  • The advanced indexes are separated by a slice, ellipsis or newaxis. For example x[arr1,:,arr2].
  • The advanced indexes are all next to each other. For example x[...,arr1,arr2,:] but not x[arr1,:,1] since 1 is an advanced index in this regard.

In the first case, the dimensions resulting from the advanced indexing operation come first in the result array, and the subspace dimensions after that. In the second case, the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array (the latter logic is what makes simple advanced indexing behave just like slicing).
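The two cases can be checked directly by looking at result shapes. The following is a minimal sketch; the array `x`, its shape, and the index values are made up for illustration and are not from the question.

```python
import numpy as np

# Illustrative 4-D array; the shape is an arbitrary choice.
x = np.zeros((5, 6, 7, 8))
arr1 = np.array([0, 1, 2, 3])
arr2 = np.array([3, 2, 1, 0])

# Case 1: advanced indexes separated by a slice.
# The broadcast index dimension (length 4) comes first,
# followed by the sliced axes (lengths 6 and 8).
separated = x[arr1, :, arr2]
print(separated.shape)    # (4, 6, 8)

# Case 2: advanced indexes next to each other.
# The broadcast dimension replaces axes 1 and 2 in place.
adjacent = x[..., arr1, arr2, :]
print(adjacent.shape)     # (5, 4, 8)

# A scalar integer counts as an advanced index when combined with an
# index array, so this is also case 1: the length-4 dimension moves
# to the front even though arr1 indexes axis 1.
scalar_case = x[:, arr1, :, 1]
print(scalar_case.shape)  # (4, 5, 7)
```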

Why is this distinction necessary?

I was expecting the behaviour described for case 2 to be used in all cases. Why does it matter whether indexes are next to each other?

I understand you may want the behaviour of case 1 in some situations; for example, "vectorization" of index results along new dimensions. But this behaviour can and should be defined by the user. That is, if case 2 behaviour was the default, case 1 behaviour would be possible using only: x[arr1,:,arr2].reshape((len(arr1),x.shape[1]))

I know you can achieve the behaviour described in case 2 using np.ix_(), but this inconsistency in default indexing behaviour is unexpected and unjustified, in my opinion. Can someone justify it?

Thanks

Answer

The case 2 behaviour is not well defined for case 1. Look again at this sentence:

In the second case, the dimensions from the advanced indexing operations are inserted into the result array at the same spot as they were in the initial array

You're probably imagining a one-to-one correspondence between input and output dimensions, perhaps because you're imagining Matlab-style indexing. NumPy doesn't work like that. If you have four arrays with the following shapes:

a.shape == (2, 3, 4, 5, 6)
b.shape == (20, 30)
c.shape == (20, 30)
d.shape == (20, 30)

then a[b, :, c, :, d] has four dimensions, with lengths 3, 5, 20, and 30. There is no unambiguous place to put the 20 and the 30. NumPy defaults to sticking them in front.
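The shapes above can be reproduced with a short sketch. The index values are random placeholders chosen only to be in range for each axis; the use of `np.moveaxis` at the end is an illustration of how a caller could relocate the broadcast block afterwards, not something the answer prescribes.

```python
import numpy as np

rng = np.random.default_rng(0)

a = np.zeros((2, 3, 4, 5, 6))
# Placeholder index arrays, each with shape (20, 30) and values
# valid for the axis they index (axes 0, 2 and 4 respectively).
b = rng.integers(0, 2, size=(20, 30))
c = rng.integers(0, 4, size=(20, 30))
d = rng.integers(0, 6, size=(20, 30))

# The indexed axes (0, 2, 4) are not contiguous, so there is no single
# slot for the (20, 30) block; NumPy puts it at the front.
front = a[b, :, c, :, d]
print(front.shape)  # (20, 30, 3, 5)

# If another layout is wanted, the block can be moved explicitly.
moved = np.moveaxis(front, (0, 1), (-2, -1))
print(moved.shape)  # (3, 5, 20, 30)
```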

On the other hand, with a[:, b, c, d, :], the 20 and 30 can go where the 3, 4, and 5 were, because the 3, 4, and 5 were next to each other. The whole block of new dimensions goes where the whole block of original dimensions was, which only works if the original dimensions were in a single block in the original shape.
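For comparison, here is the adjacent case as a small sketch, again with placeholder index values that are only assumed to be in range for axes 1, 2 and 3.

```python
import numpy as np

rng = np.random.default_rng(0)

a = np.zeros((2, 3, 4, 5, 6))
# Placeholder index arrays of shape (20, 30) for axes 1, 2 and 3.
b = rng.integers(0, 3, size=(20, 30))
c = rng.integers(0, 4, size=(20, 30))
d = rng.integers(0, 5, size=(20, 30))

# Axes 1, 2 and 3 (lengths 3, 4, 5) form one contiguous block, so the
# broadcast (20, 30) block takes their place between axes 0 and 4.
in_place = a[:, b, c, d, :]
print(in_place.shape)  # (2, 20, 30, 6)
```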
