Output of numpy.where(condition) is not an array, but a tuple of arrays: why?

Problem description

I am experimenting with the numpy.where(condition[, x, y]) function.
From the numpy documentation, I understand that if you give just one array as input, it should return the indices where the array is non-zero (i.e. "True"):

If only condition is given, return the tuple condition.nonzero(), the indices where condition is True.

But if I try it, it returns a tuple of two elements, where the first is the wanted list of indices and the second is a null element:

>>> import numpy as np
>>> array = np.array([1,2,3,4,5,6,7,8,9])
>>> np.where(array>4)
(array([4, 5, 6, 7, 8]),) # notice the comma before the last parenthesis

So the question is: why? What is the purpose of this behaviour? In what situation is this useful? Indeed, to get the wanted list of indices I have to add the indexing, as in np.where(array>4)[0], which seems... "ugly".


ADDENDUM

I understand (from some answers) that it is actually a tuple of just one element. I still don't understand why the output is given in this way. To illustrate how this is not ideal, consider the following error (which motivated my question in the first place):

>>> import numpy as np
>>> array = np.array([1,2,3,4,5,6,7,8,9])
>>> pippo = np.where(array>4)
>>> pippo + 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can only concatenate tuple (not "int") to tuple

So you need to do some indexing to access the actual array of indices:

>>> pippo[0] + 1
array([5, 6, 7, 8, 9])

Solution

In Python, (1) means just 1. Parentheses can be freely added around numbers and expressions to group them for human readability (compare (1+3)*3 with (1+3,)*3). So to denote a 1-element tuple, Python displays it as (1,) (and requires you to write it that way as well).
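
A minimal sketch of the difference, using only built-in Python (the variable names are made up for illustration):

```python
# Parentheses alone only group; the trailing comma is what makes a tuple.
grouped = (1)     # plain int
single = (1,)     # one-element tuple

print(type(grouped).__name__)   # int
print(type(single).__name__)    # tuple

product = (1 + 3) * 3      # arithmetic on the int 4
repeated = (1 + 3,) * 3    # the tuple (4,) repeated three times
print(product)    # 12
print(repeated)   # (4, 4, 4)
```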

Thus

(array([4, 5, 6, 7, 8]),)

is a one element tuple, that element being an array.

If you applied where to a 2d array, the result would be a 2 element tuple.
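
As a small illustration, assuming a made-up 2d array b (not part of the original example), the tuple holds one index array per dimension:

```python
import numpy as np

# Hypothetical 2-D example array, chosen only for illustration.
b = np.array([[0, 5, 0],
              [7, 0, 9]])

result = np.where(b > 0)
rows, cols = result   # one index array per dimension

print(len(result) == b.ndim)   # True
print(rows)   # [0 1 1]
print(cols)   # [1 0 2]
```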

The result of where is designed so that it can be plugged directly into an indexing slot, e.g.

a[np.where(a>0)]
a[a>0]

should return the same thing, as would

I,J = np.where(a>0)   # a is 2d
a[I,J]
a[(I,J)]
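
The equivalence can be checked with a short sketch. The array b is a hypothetical 2d example, not taken from the original post:

```python
import numpy as np

# Hypothetical 2-D example array, chosen only for illustration.
b = np.array([[0, 5, 0],
              [7, 0, 9]])

idx = np.where(b > 0)   # tuple of (row indices, column indices)
I, J = idx

via_where = b[idx]      # index with the tuple itself
via_mask = b[b > 0]     # index with the boolean mask
via_pair = b[I, J]      # index with the unpacked arrays
via_tuple = b[(I, J)]   # same thing, tuple written explicitly

print(via_where)   # [5 7 9]
print(np.array_equal(via_where, via_mask))    # True
print(np.array_equal(via_pair, via_tuple))    # True
```

This is why the tuple form is convenient: the result already has the shape that multi-dimensional indexing expects.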

Or with your example:

In [278]: a=np.array([1,2,3,4,5,6,7,8,9])
In [279]: np.where(a>4)
Out[279]: (array([4, 5, 6, 7, 8], dtype=int32),)  # tuple

In [280]: a[np.where(a>4)]
Out[280]: array([5, 6, 7, 8, 9])

In [281]: I=np.where(a>4)
In [282]: I
Out[282]: (array([4, 5, 6, 7, 8], dtype=int32),)
In [283]: a[I]
Out[283]: array([5, 6, 7, 8, 9])

In [286]: i, = np.where(a>4)   # note the , on LHS
In [287]: i
Out[287]: array([4, 5, 6, 7, 8], dtype=int32)  # not tuple
In [288]: a[i]
Out[288]: array([5, 6, 7, 8, 9])
In [289]: a[(i,)]
Out[289]: array([5, 6, 7, 8, 9])

======================

np.flatnonzero shows the correct way of returning just one array, regardless of the dimensions of the input array.

In [299]: np.flatnonzero(a>4)
Out[299]: array([4, 5, 6, 7, 8], dtype=int32)
In [300]: np.flatnonzero(a>4)+10
Out[300]: array([14, 15, 16, 17, 18], dtype=int32)

Its doc says:

This is equivalent to a.ravel().nonzero()[0]

In fact that is literally what the function does.

Flattening a removes the question of what to do with multiple dimensions. The function then takes the result out of the tuple, giving you a plain array. Because of the flattening, it doesn't have to make a special case for 1d arrays.
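
A quick sketch of that equivalence on a hypothetical 2d array (b is made up for illustration). Note that the indices refer to positions in the flattened array:

```python
import numpy as np

# Hypothetical 2-D example array, chosen only for illustration.
b = np.array([[0, 5, 0],
              [7, 0, 9]])

flat = np.flatnonzero(b > 0)
manual = (b > 0).ravel().nonzero()[0]   # the expression quoted from the doc

print(flat)                           # [1 3 5]
print(np.array_equal(flat, manual))   # True
print(b.ravel()[flat])                # [5 7 9]
```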

===========================

@Divakar suggests np.argwhere:

In [303]: np.argwhere(a>4)
Out[303]: 
array([[4],
       [5],
       [6],
       [7],
       [8]], dtype=int32)

which is equivalent to np.transpose(np.where(a>4)).
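
For a 2d input the relationship is easier to see, since each row of the argwhere result is one (row, column) pair. A sketch on a hypothetical array b:

```python
import numpy as np

# Hypothetical 2-D example array, chosen only for illustration.
b = np.array([[0, 5, 0],
              [7, 0, 9]])

pairs = np.argwhere(b > 0)                  # one row per matching element
stacked = np.transpose(np.where(b > 0))     # same data built from where

print(pairs.tolist())                    # [[0, 1], [1, 0], [1, 2]]
print(pairs.shape)                       # (3, 2)
print(np.array_equal(pairs, stacked))    # True
```

The argwhere layout is handy for iterating over coordinates, while the where tuple is the form that can be used directly as an index.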

Or if you don't like the column vector, you could transpose it again

In [307]: np.argwhere(a>4).T
Out[307]: array([[4, 5, 6, 7, 8]], dtype=int32)

except that now it is a 1xn (2d) array rather than a 1d one.

We could just as well have wrapped the where result in np.array:

In [311]: np.array(np.where(a>4))
Out[311]: array([[4, 5, 6, 7, 8]], dtype=int32)

There are lots of ways of taking an array out of the where tuple ([0], i,=, transpose, array, etc).
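
As a rough side-by-side sketch for the 1d example from the question (the variable names are chosen here for illustration), these approaches all yield the same plain index array:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
cond = a > 4

by_index = np.where(cond)[0]                # index into the tuple
unpacked, = np.where(cond)                  # tuple unpacking on the left-hand side
flat = np.flatnonzero(cond)                 # always returns a single 1-D array
from_argwhere = np.argwhere(cond).ravel()   # (n, 1) column flattened; 1-D input only
from_array = np.array(np.where(cond))[0]    # (1, n) array, first row

for candidate in (by_index, unpacked, flat, from_argwhere, from_array):
    print(candidate.tolist())   # [4, 5, 6, 7, 8] each time
```

Which one reads best is a matter of taste. For input of arbitrary dimension, only np.flatnonzero keeps returning a single 1d array.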
