Element-wise broadcasting for comparing two NumPy arrays?
Question
Let's say I have an array like this:
import numpy as np
base_array = np.array([-13, -9, -11, -3, -3, -4, 2, 2,
2, 5, 7, 7, 8, 7, 12, 11])
Suppose I want to know: "how many elements in base_array are greater than 4?" This can be done simply by exploiting broadcasting:
np.sum(4 < base_array)
For which the answer is 7. Now, suppose instead of comparing to a single value, I want to do this over an array. In other words, for each value c in the comparison_array, find out how many elements of base_array are greater than c. If I do this the naive way, it obviously fails because it doesn't know how to broadcast it properly:
comparison_array = np.arange(-13, 13)
comparison_result = np.sum(comparison_array < base_array)
Output:
Traceback (most recent call last):
File "<pyshell#87>", line 1, in <module>
np.sum(comparison_array < base_array)
ValueError: operands could not be broadcast together with shapes (26,) (16,)
If I could somehow have each element of comparison_array get broadcast to base_array's shape, that would solve this. But I don't know how to do such an "element-wise broadcasting".
Now, I do know how to implement this for both cases using list comprehensions:
first = sum([4 < i for i in base_array])
second = [sum([c < i for i in base_array])
for c in comparison_array]
print(first)
print(second)
Output:
7
[15, 15, 14, 14, 13, 13, 13, 13, 13, 12, 10, 10, 10, 10, 10, 7, 7, 7, 6, 6, 3, 2, 2, 2, 1, 0]
But as we all know, this will be orders of magnitude slower than a correctly-vectorized numpy implementation on larger arrays. So, how should I do this in numpy so that it's fast? Ideally this solution should extend to any kind of operation where broadcasting works, not just greater-than or less-than in this example.
Recommended answer
You can simply add a dimension to the comparison array, so that the comparison is "stretched" across all values along the new dimension.
>>> np.sum(comparison_array[:, None] < base_array)
228
This is the fundamental principle with broadcasting, and works for all kinds of operations.
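To make the shapes involved concrete, here is a minimal sketch using the arrays from the question. The names col, mask and diff are illustrative only and not part of the original answer; the subtraction is just one example of another operation that broadcasts the same way.

```python
import numpy as np

base_array = np.array([-13, -9, -11, -3, -3, -4, 2, 2,
                       2, 5, 7, 7, 8, 7, 12, 11])
comparison_array = np.arange(-13, 13)

# Indexing with None (an alias for np.newaxis) turns shape (26,) into (26, 1).
col = comparison_array[:, None]
print(col.shape)   # (26, 1)

# (26, 1) against (16,) broadcasts to (26, 16): one row per comparison value.
mask = col < base_array
print(mask.shape)  # (26, 16)

# The same trick applies to other element-wise operations, e.g. subtraction.
diff = base_array - col
print(diff.shape)  # (26, 16)
```

Each row of mask holds the comparison of one value of comparison_array against every element of base_array, which is why summing over the whole array gives a single total rather than one count per value.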
If you need the sum done along an axis, you just specify the axis along which you want to sum after the comparison.
>>> np.sum(comparison_array[:, None] < base_array, axis=1)
array([15, 15, 14, 14, 13, 13, 13, 13, 13, 12, 10, 10, 10, 10, 10, 7, 7,
7, 6, 6, 3, 2, 2, 2, 1, 0])
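As a sanity check, the broadcast result can be compared against the list comprehension from the question. The sketch below also shows np.less.outer, which is not mentioned in the answer above but should give the same (26, 16) boolean array; the names expected, counts and counts_outer are illustrative.

```python
import numpy as np

base_array = np.array([-13, -9, -11, -3, -3, -4, 2, 2,
                       2, 5, 7, 7, 8, 7, 12, 11])
comparison_array = np.arange(-13, 13)

# Reference result, computed the slow way as in the question.
expected = [sum(c < i for i in base_array) for c in comparison_array]

# Broadcast version: add an axis, compare, then count along each row.
counts = np.sum(comparison_array[:, None] < base_array, axis=1)

# Alternative spelling via the ufunc's outer method (an addition here,
# not part of the original answer).
counts_outer = np.less.outer(comparison_array, base_array).sum(axis=1)

print(np.array_equal(counts, expected))      # True
print(np.array_equal(counts, counts_outer))  # True
```

Note that both vectorized versions build an intermediate array with len(comparison_array) * len(base_array) elements, so memory use grows with the product of the two sizes.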