Creating new NumPy arrays based on a condition


Question

I have two NumPy arrays:

import numpy as np

aa = np.random.rand(5,5)
bb = np.random.rand(5,5)

How can I create a new array that has a value of 1 wherever both aa and bb exceed 0.5?

Recommended answer

With a focus on performance, a few approaches can be built from two conversion methods. One is to compute the boolean array of valid positions and convert it to an int datatype with the .astype() method. The other uses np.where, which selects between 1 and 0 based on the same boolean array. So, essentially, there are two methods: one that relies on efficient datatype conversion and another that uses selection criteria. The boolean array itself can be obtained in two ways: by combining simple comparisons with &, or by using np.logical_and. With two ways to get the boolean array and two ways to convert it to an int array, we end up with the four implementations listed below:

out1 = ((aa>0.5) & (bb>0.5)).astype(int)
out2 = np.logical_and(aa>0.5, bb>0.5).astype(int)
out3 = np.where((aa>0.5) & (bb>0.5),1,0)
out4 = np.where(np.logical_and(aa>0.5, bb>0.5), 1, 0)
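As a quick sanity check, the four implementations above can be compared against each other. This is only a minimal sketch: the seeded generator from np.random.default_rng and the name all_equal are assumptions added here for reproducibility and are not part of the original answer.

```python
import numpy as np

# Seeded generator (an assumption for this sketch) so the run is reproducible.
rng = np.random.default_rng(0)
aa = rng.random((5, 5))
bb = rng.random((5, 5))

# Two ways to build the boolean array, two ways to convert it to integers.
out1 = ((aa > 0.5) & (bb > 0.5)).astype(int)
out2 = np.logical_and(aa > 0.5, bb > 0.5).astype(int)
out3 = np.where((aa > 0.5) & (bb > 0.5), 1, 0)
out4 = np.where(np.logical_and(aa > 0.5, bb > 0.5), 1, 0)

# All four should hold 1 where both inputs exceed 0.5 and 0 elsewhere.
all_equal = all(np.array_equal(out1, other) for other in (out2, out3, out4))
print(all_equal)
print(out1)
```

Since the four variants produce the same values, the choice between them comes down to readability and speed.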

You can also experiment with lower-precision datatypes, which does no harm because the values are only ever 0 and 1. The benefit should be a noticeable speedup, since a smaller output array means less memory to write. We could use int8 or uint8, given either as the strings 'int8' and 'uint8' or as the NumPy types np.int8 and np.uint8; both spellings denote the same dtype. The variants of the approaches listed earlier with these int datatypes would be:

out5 = ((aa>0.5) & (bb>0.5)).astype('int8')
out6 = np.logical_and(aa>0.5, bb>0.5).astype('int8')
out7 = ((aa>0.5) & (bb>0.5)).astype('uint8')
out8 = np.logical_and(aa>0.5, bb>0.5).astype('uint8')

out9 = ((aa>0.5) & (bb>0.5)).astype(np.int8)
out10 = np.logical_and(aa>0.5, bb>0.5).astype(np.int8)
out11 = ((aa>0.5) & (bb>0.5)).astype(np.uint8)
out12 = np.logical_and(aa>0.5, bb>0.5).astype(np.uint8)
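To make the memory argument concrete, the sketch below compares the size of the output for the default int type against an 8-bit type. The seeded generator and the variable names are assumptions for illustration; the itemsize of the default int depends on the platform and NumPy version, so no specific byte count is claimed for it.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed for reproducibility
aa = rng.random((1000, 1000))
bb = rng.random((1000, 1000))
mask = (aa > 0.5) & (bb > 0.5)

out_int = mask.astype(int)       # platform default integer
out_i8 = mask.astype('int8')     # string spelling
out_u8 = mask.astype(np.uint8)   # NumPy type spelling

# nbytes is the number of elements times the bytes per element.
for name, arr in [("int", out_int), ("int8", out_i8), ("uint8", out_u8)]:
    print(name, arr.dtype, arr.nbytes)

# The string and the NumPy type refer to the same dtype.
same_dtype = np.dtype('int8') == np.dtype(np.int8)
print(same_dtype)
```

The 8-bit outputs occupy one byte per element, so the conversion has less data to write than with the default integer type.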

Runtime test (since this post focuses on performance):

In [17]: # Input arrays
    ...: aa = np.random.rand(1000,1000)
    ...: bb = np.random.rand(1000,1000)
    ...: 

In [18]: %timeit ((aa>0.5) & (bb>0.5)).astype(int)
    ...: %timeit np.logical_and(aa>0.5, bb>0.5).astype(int)
    ...: %timeit np.where((aa>0.5) & (bb>0.5),1,0)
    ...: %timeit np.where(np.logical_and(aa>0.5, bb>0.5), 1, 0)
    ...: 
100 loops, best of 3: 9.13 ms per loop
100 loops, best of 3: 9.16 ms per loop
100 loops, best of 3: 10.4 ms per loop
100 loops, best of 3: 10.4 ms per loop

In [19]: %timeit ((aa>0.5) & (bb>0.5)).astype('int8')
    ...: %timeit np.logical_and(aa>0.5, bb>0.5).astype('int8')
    ...: %timeit ((aa>0.5) & (bb>0.5)).astype('uint8')
    ...: %timeit np.logical_and(aa>0.5, bb>0.5).astype('uint8')
    ...: 
    ...: %timeit ((aa>0.5) & (bb>0.5)).astype(np.int8)
    ...: %timeit np.logical_and(aa>0.5, bb>0.5).astype(np.int8)
    ...: %timeit ((aa>0.5) & (bb>0.5)).astype(np.uint8)
    ...: %timeit np.logical_and(aa>0.5, bb>0.5).astype(np.uint8)
    ...: 
100 loops, best of 3: 5.6 ms per loop
100 loops, best of 3: 5.61 ms per loop
100 loops, best of 3: 5.63 ms per loop
100 loops, best of 3: 5.63 ms per loop
100 loops, best of 3: 5.62 ms per loop
100 loops, best of 3: 5.62 ms per loop
100 loops, best of 3: 5.62 ms per loop
100 loops, best of 3: 5.61 ms per loop

In [20]: %timeit 1 * ((aa > 0.5) & (bb > 0.5)) #@BPL's vectorized soln
100 loops, best of 3: 10.2 ms per loop
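The timings above come from IPython's %timeit. Outside IPython, a roughly comparable measurement could be sketched with the standard-library timeit module. The dictionary of candidates, the repeat counts, and the seeded generator are assumptions chosen for this illustration, and absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed for reproducibility
aa = rng.random((1000, 1000))
bb = rng.random((1000, 1000))

# A representative subset of the variants timed above.
candidates = {
    "astype(int)": lambda: ((aa > 0.5) & (bb > 0.5)).astype(int),
    "np.where": lambda: np.where((aa > 0.5) & (bb > 0.5), 1, 0),
    "astype(np.uint8)": lambda: ((aa > 0.5) & (bb > 0.5)).astype(np.uint8),
    "1 * mask": lambda: 1 * ((aa > 0.5) & (bb > 0.5)),
}

results = {}
for name, fn in candidates.items():
    # Best of 3 repeats of 10 calls each, reported per call in milliseconds.
    best = min(timeit.repeat(fn, number=10, repeat=3)) / 10
    results[name] = best * 1e3
    print(f"{name:>18}: {results[name]:.2f} ms per loop")
```

Taking the minimum over repeats mirrors the "best of 3" convention shown in the IPython output.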
