In numpy, what does indexing an array with the empty tuple vs. ellipsis do?


Question

I just discovered, by chance, that an array in numpy may be indexed by an empty tuple:

In [62]: a = arange(5)

In [63]: a[()]
Out[63]: array([0, 1, 2, 3, 4])

I found some documentation on the numpy wiki ZeroRankArray:

(Sasha) First, whatever choice is made for x[...] and x[()] they should be the same because ... is just syntactic sugar for "as many : as necessary", which in the case of zero rank leads to ... = (:,)*0 = (). Second, rank zero arrays and numpy scalar types are interchangeable within numpy, but numpy scalars can be use in some python constructs where ndarrays can't.

So, for 0-d arrays a[()] and a[...] are supposed to be equivalent. Are they for higher-dimensional arrays, too? They strongly appear to be:

In [65]: a = arange(25).reshape(5, 5)

In [66]: a[()] is a[...]
Out[66]: False

In [67]: (a[()] == a[...]).all()
Out[67]: True

In [68]: a = arange(3**7).reshape((3,)*7)

In [69]: (a[()] == a[...]).all()
Out[69]: True

But it is not syntactic sugar, neither for a high-dimensional array nor even for a 0-d array:

In [76]: a[()] is a
Out[76]: False

In [77]: a[...] is a
Out[77]: True

In [79]: b = array(0)

In [80]: b[()] is b
Out[80]: False

In [81]: b[...] is b
Out[81]: True

And then there is the case of indexing by an empty list, which does something else altogether, but appears equivalent to indexing with an empty ndarray:

In [78]: a[[]]
Out[78]: array([], shape=(0, 3, 3, 3, 3, 3, 3), dtype=int64)

In [86]: a[arange(0)]
Out[86]: array([], shape=(0, 3, 3, 3, 3, 3, 3), dtype=int64)

In [82]: b[[]]
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)

IndexError: 0-d arrays can't be indexed.

So, it appears that () and ... are similar but not quite identical and indexing with [] means something else altogether. And a[] or b[] are SyntaxErrors. Indexing with lists is documented at index arrays, and there is a short notice about indexing with tuples at the end of the same document.
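The three cases summarised above can be reproduced in one short script. This is only a sketch: the variable names (`whole_tuple`, `whole_ellipsis`, `empty_pick`) are made up for illustration, and it checks values and shapes only, not object identity.

```python
import numpy as np

a = np.arange(3 ** 7).reshape((3,) * 7)

# The empty tuple and the ellipsis both select the whole array.
whole_tuple = a[()]
whole_ellipsis = a[...]
print(whole_tuple.shape == a.shape, whole_ellipsis.shape == a.shape)
print(np.array_equal(whole_tuple, whole_ellipsis))

# An empty list acts as an (empty) index array on the first axis,
# so it selects zero entries along that axis and keeps the rest.
empty_pick = a[[]]
print(empty_pick.shape)

# An empty subscript is rejected by the Python parser itself,
# before numpy is ever involved.
try:
    compile("a[]", "<example>", "eval")
except SyntaxError:
    print("a[] does not parse")
```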

That leaves the question:

Is the difference between a[()] and a[...] by design? What is the design, then?

(Question somehow reminiscent of: What does the empty `()` do on a Matlab matrix?)

Edit:

In fact, even scalars may be indexed by an empty tuple:

In [36]: numpy.int64(10)[()]
Out[36]: 10

Solution

The treatment of A[...] is a special case, optimised to always return A itself:

if (op == Py_Ellipsis) {
    Py_INCREF(self);
    return (PyObject *)self;
}

Anything else that should be equivalent, e.g. A[:], A[(Ellipsis,)], A[()] or A[(slice(None),) * A.ndim], will instead return a view of the entirety of A, whose base is A:

>>> A[()] is A
False
>>> A[()].base is A
True
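As a rough check of the claim about views, the sketch below loops over the index forms listed above. It assumes `A` is a 1-d array that owns its data (a freshly created `np.arange`), because a view of a view reports the original owner as its `base`. The `candidates` dict is only an illustrative helper.

```python
import numpy as np

A = np.arange(6)  # owns its data, so views of it report A as their base

candidates = {
    "A[:]": A[:],
    "A[(Ellipsis,)]": A[(Ellipsis,)],
    "A[()]": A[()],
    "A[(slice(None),) * A.ndim]": A[(slice(None),) * A.ndim],
}

for label, view in candidates.items():
    # Each result is a distinct ndarray object sharing A's buffer.
    print(label, view is A, view.base is A, np.shares_memory(view, A))

# Writing through one of the views is visible in A: no data was copied.
A[()][0] = 99
print(A[0])
```

If `A` were itself a view (for example the result of `reshape`), `view.base is A` could be False even though the memory is still shared, so `np.shares_memory` is probably the more robust test.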

This seems an unnecessary and premature optimisation, as A[(Ellipsis,)] and A[()] will always give the same result (a view of the whole of A). From looking at https://github.com/numpy/numpy/commit/fa547b80f7035da85f66f9cbabc4ff75969d23cd it seems that it was originally required because indexing with ... didn't work properly on 0d arrays (prior to https://github.com/numpy/numpy/commit/4156b241aa3670f923428d4e72577a9962cdf042 it would return the element as a scalar), and was then extended to all arrays for consistency. Since then, indexing has been fixed on 0d arrays, so the optimisation isn't required, but it has managed to stick around vestigially (and there's probably some code that depends on A[...] is A being true).
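Since the shortcut is described here as vestigial, it seems safer for code not to depend on the identity of `A[...]`. The sketch below assumes that identity may differ between numpy versions, so it prints that result instead of asserting it, and relies on shared memory and equal values instead. The names `same_data`, `same_values` and `z` are illustrative.

```python
import numpy as np

A = np.arange(4)

# Treated as an implementation detail in this sketch:
# print the identity check rather than depend on it.
print("A[...] is A:", A[...] is A)

# Checks that do not depend on whether a new view object was created.
same_data = np.shares_memory(A[...], A)
same_values = np.array_equal(A[...], A)
print(same_data, same_values)

# On a 0-d array, ... keeps a 0-d array while () extracts the element
# as a numpy scalar.
z = np.array(7)
print(type(z[...]).__name__, z[...].ndim)
print(isinstance(z[()], np.generic), z[()])
```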
