How does numpy broadcasting perform faster?

Question

In the following question, https://stackoverflow.com/a/40056135/5714445

Numpy's broadcasting provides a solution that's almost 6x faster than using np.setdiff1d() paired with np.view(). How does it manage to do this?

And using A[~((A[:,None,:] == B).all(-1)).any(1)] speeds it up even more. Interesting, but it raises yet another question: how does this perform even better?

Recommended answer

I will try to answer the second part of the question.

So, we are comparing these two approaches:

A[np.all(np.any((A-B[:, None]), axis=2), axis=0)]  (I)

A[~((A[:,None,:] == B).all(-1)).any(1)]

To compare it from the same perspective as the first one, the second approach can be rewritten like this:

A[(((~(A[:,None,:] == B)).any(2))).all(1)]         (II)

The major difference in terms of performance is that in the first approach we find the non-matches by subtraction and then check for non-zeros with .any(). Thus, any() is made to operate on an array of non-boolean dtype. In the second approach, we instead feed it a boolean array obtained with A[:,None,:] == B.
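
As a rough sketch of the point above, the snippet below builds two small made-up arrays and checks that approach (I), approach (II), and the original form of the second approach all select the same rows, while printing the dtype that any() receives in each case. The array shapes, the seed, and the names mask_sub, mask_eq, mask_orig, diff and eq are illustrative assumptions, not part of the original answer.

```python
import numpy as np

# Small made-up inputs: rows of A that also occur in B should be dropped.
rng = np.random.default_rng(0)
A = rng.integers(0, 3, size=(8, 2))
B = rng.integers(0, 3, size=(4, 2))

# Approach (I): subtraction yields an integer array of shape (len(B), len(A), 2).
diff = A - B[:, None]
mask_sub = np.all(np.any(diff, axis=2), axis=0)

# Approach (II): comparison yields a boolean array of shape (len(A), len(B), 2).
eq = A[:, None, :] == B
mask_eq = (~eq).any(2).all(1)

# Original form of the second approach (same result by De Morgan's laws).
mask_orig = ~(eq.all(-1)).any(1)

print("dtype fed to any() in (I): ", diff.dtype)
print("dtype fed to any() in (II):", eq.dtype)
print("masks agree:", np.array_equal(mask_sub, mask_eq))
print(A[mask_eq])
```

Both masks keep a row of A only if it differs from every row of B in at least one column; the approaches differ only in the dtype of the intermediate array that the reductions run on.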

Let's do a small runtime test to see how .any() performs on an int dtype array vs. a boolean array:

In [141]: A = np.random.randint(0,9,(1000,1000)) # An int array

In [142]: %timeit A.any(0)
1000 loops, best of 3: 1.43 ms per loop

In [143]: A = np.random.randint(0,9,(1000,1000))>5 # A boolean array

In [144]: %timeit A.any(0)
10000 loops, best of 3: 164 µs per loop

So, with close to a 9x speedup on this part, we see a huge advantage in using any() with boolean arrays. I think this is the biggest reason the second approach is faster.
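
For anyone who wants to repeat the measurement outside IPython, here is a minimal sketch using the standard timeit module. The array size mirrors the test above; the names int_arr, bool_arr, t_int and t_bool and the repeat counts are assumptions chosen for illustration. The absolute numbers, and possibly the ratio, will vary with the machine and the NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
int_arr = rng.integers(0, 9, size=(1000, 1000))  # an int array
bool_arr = int_arr > 5                           # a boolean array

runs = 100
# Best-of-3 average time per call, in seconds.
t_int = min(timeit.repeat(lambda: int_arr.any(0), number=runs, repeat=3)) / runs
t_bool = min(timeit.repeat(lambda: bool_arr.any(0), number=runs, repeat=3)) / runs

print(f"any(0) on {int_arr.dtype}: {t_int * 1e6:.1f} µs per call")
print(f"any(0) on {bool_arr.dtype}: {t_bool * 1e6:.1f} µs per call")
print(f"ratio int/bool: {t_int / t_bool:.1f}x")
```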
