Combination of all rows in two numpy arrays


Question

I have two arrays, for example one with shape (3,2) and the other with shape (10,7). I want all combinations of the two arrays such that I end up with a 9-column array. In other words, I want every row of the first array combined with every row of the second array.

How can I do this? As far as I can tell, I am not using meshgrid correctly.

Based on previous posts, I was under the impression that

a1 = np.zeros((10,7))
a2 = np.zeros((3,2))
r = np.array(np.meshgrid(a1, a2)).T.reshape(-1, a1.shape[1] + a2.shape[1])

would work, but that gives me dimensions of (84,10).
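One likely reason the attempt misbehaves is that np.meshgrid treats each input as a 1-D coordinate vector, so 2-D inputs are flattened and individual elements get paired rather than whole rows. The sketch below illustrates this, and then shows a possible workaround that builds the grid over row indices instead. The sample values and the names `i`, `j`, and `r` are illustrative assumptions, not part of the original question.

```python
import numpy as np

# Illustrative inputs with the shapes from the question
a1 = np.arange(70).reshape(10, 7)
a2 = np.arange(6).reshape(3, 2)

# meshgrid flattens 2-D inputs, so it pairs single elements, not rows
g1, g2 = np.meshgrid(a1, a2)
print(g1.shape)  # (6, 70)

# Workaround sketch: build the grid over row indices, then gather rows
i, j = np.meshgrid(np.arange(a1.shape[0]), np.arange(a2.shape[0]), indexing="ij")
r = np.hstack([a1[i.ravel()], a2[j.ravel()]])
print(r.shape)  # (30, 9)
```

Row k of `r` holds row k // 3 of a1 followed by row k % 3 of a2.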

Recommended answer

Approach #1

With a focus on performance, here's one approach that uses array initialization and broadcasting for the assignments:

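m1,n1 = a1.shape
m2,n2 = a2.shape
# One slot per (row of a1, row of a2) pair; dtype=int assumes integer inputs
out = np.zeros((m1,m2,n1+n2),dtype=int)
out[:,:,:n1] = a1[:,None,:]  # broadcast each row of a1 across the m2 axis
out[:,:,n1:] = a2            # broadcast all of a2 across the m1 axis
out.shape = (m1*m2,-1)       # flatten the pairs into rows, in place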
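The same steps can be wrapped in a small reusable function. This is only a sketch: the name `row_combinations` is hypothetical, and using np.result_type to pick the output dtype (so that float inputs are not truncated) is an assumption added here, not part of the original answer.

```python
import numpy as np

def row_combinations(a1, a2):
    """Pair every row of a1 with every row of a2 (hypothetical helper)."""
    a1 = np.asarray(a1)
    a2 = np.asarray(a2)
    m1, n1 = a1.shape
    m2, n2 = a2.shape
    # Assumption: choose a dtype that can hold values from both inputs
    out = np.empty((m1, m2, n1 + n2), dtype=np.result_type(a1, a2))
    out[:, :, :n1] = a1[:, None, :]  # rows of a1, broadcast over the m2 axis
    out[:, :, n1:] = a2              # all of a2, broadcast over the m1 axis
    return out.reshape(m1 * m2, n1 + n2)

# Shapes from the question: (10,7) and (3,2) give 30 rows of 9 columns
r = row_combinations(np.zeros((10, 7)), np.zeros((3, 2)))
print(r.shape)  # (30, 9)
```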

Explanation:

The trick lies in these two steps:

out[:,:,:n1] = a1[:,None,:]
out[:,:,n1:] = a2

Step 1:

In [227]: np.random.seed(0)

In [228]: a1 = np.random.randint(1,9,(3,2))

In [229]: a2 = np.random.randint(1,9,(2,7))

In [230]: m1,n1 = a1.shape
     ...: m2,n2 = a2.shape
     ...: out = np.zeros((m1,m2,n1+n2),dtype=int)
     ...: 

In [231]: out[:,:,:n1] = a1[:,None,:]

In [232]: out[:,:,:n1]
Out[232]: 
array([[[5, 8],
        [5, 8]],

       [[6, 1],
        [6, 1]],

       [[4, 4],
        [4, 4]]])

In [233]: a1[:,None,:]
Out[233]: 
array([[[5, 8]],

       [[6, 1]],

       [[4, 4]]])

So we are assigning the elements of a1 with its first axis aligned to the first axis of the output, while the new axis inserted into a1 (via None) is broadcast along the second axis of the output. This is the crux, and it is where the performance comes from: no intermediate repeated copy of a1 is allocated, which explicit repeating/tiling methods would otherwise need.
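To make the "no extra memory" point concrete, here is a small sketch comparing a broadcast view with an explicit repeat. It uses np.broadcast_to only to make the broadcasting visible; the assignment in the answer broadcasts implicitly. The names `view` and `copy` are illustrative.

```python
import numpy as np

a1 = np.array([[5, 8], [6, 1], [4, 4]])
m2 = 2  # number of rows in the second array, as in the sample session

# Broadcast view: same values as the assigned slice, but no new data buffer
view = np.broadcast_to(a1[:, None, :], (a1.shape[0], m2, a1.shape[1]))
# Explicit repeat: allocates a new array holding m2 copies of each row
copy = np.repeat(a1[:, None, :], m2, axis=1)

print(np.array_equal(view, copy))  # True
print(np.shares_memory(view, a1))  # True
print(np.shares_memory(copy, a1))  # False
```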

Step 2:

In [237]: out[:,:,n1:] = a2

In [238]: out[:,:,n1:]
Out[238]: 
array([[[4, 8, 2, 4, 6, 3, 5],
        [8, 7, 1, 1, 5, 3, 2]],

       [[4, 8, 2, 4, 6, 3, 5],
        [8, 7, 1, 1, 5, 3, 2]],

       [[4, 8, 2, 4, 6, 3, 5],
        [8, 7, 1, 1, 5, 3, 2]]])

In [239]: a2
Out[239]: 
array([[4, 8, 2, 4, 6, 3, 5],
       [8, 7, 1, 1, 5, 3, 2]])

Here, we are broadcasting the block a2 along the first axis of the output array without explicitly making repeated copies.

Sample input and output, for completeness:

In [242]: a1
Out[242]: 
array([[5, 8],
       [6, 1],
       [4, 4]])

In [243]: a2
Out[243]: 
array([[4, 8, 2, 4, 6, 3, 5],
       [8, 7, 1, 1, 5, 3, 2]])

In [244]: out
Out[244]: 
array([[[5, 8, 4, 8, 2, 4, 6, 3, 5],
        [5, 8, 8, 7, 1, 1, 5, 3, 2]],

       [[6, 1, 4, 8, 2, 4, 6, 3, 5],
        [6, 1, 8, 7, 1, 1, 5, 3, 2]],

       [[4, 4, 4, 8, 2, 4, 6, 3, 5],
        [4, 4, 8, 7, 1, 1, 5, 3, 2]]])
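Note that the `out` displayed in the session is still 3-D, with shape (3, 2, 9), because the final reshape has not been applied yet. A sketch of the last step, using the sample values from the session as literals (the name `out2d` is illustrative):

```python
import numpy as np

a1 = np.array([[5, 8], [6, 1], [4, 4]])
a2 = np.array([[4, 8, 2, 4, 6, 3, 5],
               [8, 7, 1, 1, 5, 3, 2]])
m1, n1 = a1.shape
m2, n2 = a2.shape

out = np.zeros((m1, m2, n1 + n2), dtype=int)
out[:, :, :n1] = a1[:, None, :]
out[:, :, n1:] = a2

# Collapse the first two axes so each (a1 row, a2 row) pair becomes one row
out2d = out.reshape(m1 * m2, -1)
print(out2d.shape)  # (6, 9)
print(out2d[1])     # [5 8 8 7 1 1 5 3 2]
```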

Approach #2

Another one, with tiling/repeating:

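# m1,n1 = a1.shape and m2,n2 = a2.shape, as in Approach #1
parte1 = np.repeat(a1[:,None,:],m2,axis=0).reshape(-1,n1)  # each row of a1 repeated m2 times
parte2 = np.repeat(a2[None],m1,axis=0).reshape(-1,n2)      # whole of a2 tiled m1 times
out = np.c_[parte1, parte2]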
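As a sanity check, a repeat/tile construction should produce the same rows, in the same order, as the broadcasting approach. The sketch below uses np.repeat on rows and np.tile, a slightly simplified variant rather than the exact code from the answer, and deliberately picks shapes where n1 differs from m2 so that a mix-up between the two would surface. All names and sample values are illustrative.

```python
import numpy as np

a1 = np.arange(12).reshape(4, 3)        # m1=4, n1=3
a2 = np.arange(100, 110).reshape(2, 5)  # m2=2, n2=5
m1, n1 = a1.shape
m2, n2 = a2.shape

# Repeat/tile variant: each a1 row m2 times, then a2 stacked m1 times
tiled = np.hstack([np.repeat(a1, m2, axis=0), np.tile(a2, (m1, 1))])

# Broadcasting variant, as in Approach #1
bcast = np.empty((m1, m2, n1 + n2), dtype=int)
bcast[:, :, :n1] = a1[:, None, :]
bcast[:, :, n1:] = a2
bcast = bcast.reshape(m1 * m2, -1)

print(np.array_equal(tiled, bcast))  # True
```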
