Element-wise broadcasting for comparing two NumPy arrays?

Question

Let's say I have an array like this:

import numpy as np

base_array = np.array([-13, -9, -11, -3, -3, -4,   2,  2,
                         2,  5,   7,  7,  8,  7,  12, 11])

Suppose I want to know: "how many elements in base_array are greater than 4?" This can be done simply by exploiting broadcasting:

np.sum(4 < base_array)

The answer is 7. Now suppose that, instead of comparing to a single value, I want to do this over an array. In other words, for each value c in comparison_array, find out how many elements of base_array are greater than c. If I do this the naive way, it fails, because NumPy cannot broadcast the two shapes against each other:

comparison_array = np.arange(-13, 13)
comparison_result = np.sum(comparison_array < base_array)

Output:

Traceback (most recent call last):
  File "<pyshell#87>", line 1, in <module>
    np.sum(comparison_array < base_array)
ValueError: operands could not be broadcast together with shapes (26,) (16,) 

If I could somehow have each element of comparison_array broadcast against the whole of base_array, that would solve this. But I don't know how to do such an "element-wise broadcasting".

Now, I do know how to implement this for both cases using list comprehensions:

first = sum([4 < i for i in base_array])
second = [sum([c < i for i in base_array])
          for c in comparison_array]
print(first)
print(second)

Output:

7
[15, 15, 14, 14, 13, 13, 13, 13, 13, 12, 10, 10, 10, 10, 10, 7, 7, 7, 6, 6, 3, 2, 2, 2, 1, 0]

But as we all know, on larger arrays this will be orders of magnitude slower than a correctly vectorized NumPy implementation. So how should I do this in NumPy so that it's fast? Ideally the solution should extend to any kind of operation where broadcasting works, not just the greater-than or less-than comparison in this example.

Recommended answer

You can simply add a dimension to the comparison array, so that the comparison is "stretched" across all values along the new dimension. Without an axis argument, np.sum then adds up every entry of the resulting 2-D boolean array, giving one grand total:

>>> np.sum(comparison_array[:, None] < base_array)
228
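To make the "stretching" concrete, here is a small sketch of the shapes involved. The names column and mask are only illustrative; the arrays are the ones from the question.

```python
import numpy as np

base_array = np.array([-13, -9, -11, -3, -3, -4, 2, 2,
                       2, 5, 7, 7, 8, 7, 12, 11])
comparison_array = np.arange(-13, 13)

# Indexing with None (np.newaxis) turns the (26,) array into a (26, 1) column.
column = comparison_array[:, None]

# Broadcasting (26, 1) against (16,) yields a (26, 16) boolean array:
# row i holds comparison_array[i] < base_array.
mask = column < base_array

print(column.shape)  # (26, 1)
print(mask.shape)    # (26, 16)
print(mask.sum())    # 228, the grand total over all 26 * 16 comparisons
```

Each row of mask is the single-value comparison from the question, repeated once per element of comparison_array.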

This is the fundamental principle of broadcasting, and it works for all kinds of operations.
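As a rough illustration of how the same trick carries over to other operations, the sketch below wraps it in a helper. The function count_pairs is hypothetical (it is not part of NumPy); it only assumes that op is a binary function that broadcasts, such as np.less or np.equal.

```python
import numpy as np

def count_pairs(op, queries, base):
    """For each value q in `queries`, count the elements x of `base`
    for which op(q, x) is True. Hypothetical helper for illustration."""
    queries = np.asarray(queries)
    base = np.asarray(base)
    # (len(queries), 1) against (len(base),) -> (len(queries), len(base))
    return op(queries[:, None], base).sum(axis=1)

base_array = np.array([-13, -9, -11, -3, -3, -4, 2, 2,
                       2, 5, 7, 7, 8, 7, 12, 11])

print(count_pairs(np.less, [4], base_array))      # [7]
print(count_pairs(np.equal, [2, 7], base_array))  # [3 3]
```

The added axis works the same way for arithmetic: for example, comparison_array[:, None] - base_array gives the full table of pairwise differences.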

If you need the sum along a particular axis, specify that axis when summing the result of the comparison. Here axis=1 sums over base_array, giving one count per element of comparison_array:

>>> np.sum(comparison_array[:, None] < base_array, axis=1)
array([15, 15, 14, 14, 13, 13, 13, 13, 13, 12, 10, 10, 10, 10, 10,  7,  7,
        7,  6,  6,  3,  2,  2,  2,  1,  0])
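One caveat: the broadcast comparison materializes a temporary array with len(comparison_array) * len(base_array) entries, which can become large. For order comparisons specifically, a different technique (sorting plus np.searchsorted, not broadcasting) gives the same counts without that temporary. This is a sketch of an alternative, not part of the answer above, and it does not generalize to arbitrary operations.

```python
import numpy as np

base_array = np.array([-13, -9, -11, -3, -3, -4, 2, 2,
                       2, 5, 7, 7, 8, 7, 12, 11])
comparison_array = np.arange(-13, 13)

sorted_base = np.sort(base_array)

# side='right' returns, for each c, the number of elements <= c,
# so the remainder is the number of elements strictly greater than c.
counts = sorted_base.size - np.searchsorted(sorted_base, comparison_array,
                                            side='right')

# Same result as the broadcast version.
broadcast_counts = np.sum(comparison_array[:, None] < base_array, axis=1)
print(np.array_equal(counts, broadcast_counts))  # True
```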
