特定元素的Numpy/Scipy广播计算标量积 [英] Numpy/Scipy broadcast calculating scalar product for a certain elements

查看:134
本文介绍了特定元素的Numpy/Scipy广播计算标量积的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个像A那样的稀疏矩阵

I've a sparse matrix like A

和一个带有应用于计算标量积的行的数据框(df).

and a dataframe(df) with rows that should be taken to calculate scalar product.

Row1 Row2  Value
2    147   scalar product of vectors at Row1 and Raw2 in matrix A

我可以广播方式播放而不循环播放吗?

Can I do it in broadcasting manner without looping etc?

在我的情况下,A大小为1m * 100k,数据帧为10M

In my case A like 1m*100k size and the dataframe 10M

推荐答案

从一个小的稀疏"矩阵开始(csr最适合数学):

Start with a small 'sparse' matrix (csr is the best for math):

In [167]: A=sparse.csr_matrix([[1, 2, 3],  # Vadim's example
               [2, 1, 4],
               [0, 2, 2],
               [3, 0, 3]])

In [168]: AA=A.A    # dense equivalent  

In [169]: idx=np.array([[1,1,0,3],[3,0,0,2]]).T  # indexes

我会坚持使用numpy版本(Pandas建立在numpy之上)

I'll stick with the numpy version (Pandas is built on top of numpy)

我们可以获取所有行点积,并选择由idx定义的子集:

We could take all the row dot products, and select a subset defined by idx:

In [170]: (AA.dot(AA.T))[idx[:,0], idx[:,1]]
Out[170]: array([18, 16, 14,  6], dtype=int32)

稀疏矩阵乘积(A.dot(A.T)也适用:

Sparse matrix product (A.dot(A.T) also works:

In [171]: (A*A.T)[idx[:,0], idx[:,1]]
Out[171]: matrix([[18, 16, 14,  6]], dtype=int32)

或者我们可以先选择行,然后取乘积之和.我们不想在这里使用dot,因为我们并没有采用所有组合.

Or we can select the rows first, and then take the sum of products. We don't want to use dot here, since we aren't taking all combinations.

In [172]: (AA[idx[:,0]]*AA[idx[:,1]]).sum(axis=1)
Out[172]: array([18, 16, 14,  6], dtype=int32)

此计算的einsum版本:

In [180]: np.einsum('ij,ij->i',AA[idx[:,0]],AA[idx[:,1]])
Out[180]: array([18, 16, 14,  6], dtype=int32)

sparse可以执行相同的操作(*是矩阵乘积,.multiply是一个元素一个元素).

sparse can do the same (* is matrix product, .multiply is element by element).

In [173]: (A[idx[:,0]].multiply(A[idx[:,1]])).sum(axis=1)
Out[173]: 
matrix([[18],
        [16],
        [14],
        [ 6]], dtype=int32)

在这种情况下,密集版本的速度更快.稀疏的行索​​引很慢.

With this small case, the dense versions are faster. Sparse row indexing is slow.

In [181]: timeit np.einsum('ij,ij->i', AA[idx[:,0]], AA[idx[:,1]])
100000 loops, best of 3: 18.1 µs per loop

In [182]: timeit (A[idx[:,0]].multiply(A[idx[:,1]])).sum(axis=1)
1000 loops, best of 3: 1.32 ms per loop

In [184]: timeit (AA.dot(AA.T))[idx[:,0], idx[:,1]]
100000 loops, best of 3: 9.62 µs per loop

In [185]: timeit (A*A.T)[idx[:,0], idx[:,1]]
1000 loops, best of 3: 689 µs per loop

我差点忘了-迭代版本:

I almost forgot - the iterative versions:

In [191]: timeit [AA[i].dot(AA[j]) for i,j in idx]
10000 loops, best of 3: 38.4 µs per loop

In [192]: timeit [A[i].multiply(A[j]).sum() for i,j in idx]
100 loops, best of 3: 2.58 ms per loop


lil格式矩阵的行索引速度更快


Row indexing of lil format matrix is faster

In [207]: Al=A.tolil()

In [208]: timeit A[idx[:,0]]
1000 loops, best of 3: 476 µs per loop

In [209]: timeit Al[idx[:,0]]
1000 loops, best of 3: 234 µs per loop

但是,当将其转换回csr进行乘法运算时,可能无法节省时间.

But by the time it's converted back to csr for multiplication it might not save time.

===============

===============

在其他最近的SO问题中,我讨论了索引稀疏行或列的更快方法.但是,在这些目标中,最终目标是对一组选定的行或列求和.为此,实际上使用矩阵乘积最快-矩阵为1s和0s.在这里应用这个想法有点棘手.

In other recent SO questions I've discussed faster ways of indexing sparse rows or columns. But in those the final goal was to sum over a selected set of rows or columns. For that it was actually fastest to use a matrix product - with a matrix of 1s and 0s. Applying that idea here is a little trickier.

查看csr.__getitem__索引功能,我发现它实际上是对矩阵乘积进行A[idx,:]索引.它创建一个具有以下功能的extractor矩阵:

Looking at the csr.__getitem__ indexing function, I find that it actually does the A[idx,:] indexing with a matrix product. It creates an extractor matrix with function like:

def extractor(indices,N):
    """Return a sparse matrix P so that P*self implements
    slicing of the form self[[1,2,3],:]
    """
    indptr = np.arange(len(indices)+1, dtype=int)
    data = np.ones(len(indices), dtype=int)
    shape = (len(indices),N)
    return sparse.csr_matrix((data,indices,indptr), shape=shape)

In [328]: %%timeit
   .....: A1=extractor(idx[:,0],4)*A
   .....: A2=extractor(idx[:,1],4)*A
   .....: (A1.multiply(A2)).sum(axis=1)
   .....: 
1000 loops, best of 3: 1.14 ms per loop

这次比A[idx[:,0],:](上面的In[182])产生的时间稍好-大概是因为它有点简化了动作.它应该以相同的方式缩放.

This time is slightly better than that produced with A[idx[:,0],:] (In[182] above) - presumably because it is streamlining the action a bit. It should scale in the same way.

之所以可行,是因为idx0是从[1,1,0,3]

This works because idx0 is a boolean matrix derived from [1,1,0,3]

In [330]: extractor(idx[:,0],4).A
Out[330]: 
array([[0, 1, 0, 0],
       [0, 1, 0, 0],
       [1, 0, 0, 0],
       [0, 0, 0, 1]])

In [296]: A[idx[:,0],:].A
Out[296]: 
array([[2, 1, 4],
       [2, 1, 4],
       [1, 2, 3],
       [3, 0, 3]], dtype=int32)

In [331]: (extractor(idx[:,0],4)*A).A
Out[331]: 
array([[2, 1, 4],
       [2, 1, 4],
       [1, 2, 3],
       [3, 0, 3]], dtype=int32)

================

================

总而言之,如果问题太大而无法直接使用密集数组,那么最好将其缩放到较大的稀疏情况是

In sum, if the problem is too big to use the dense array directly, then be best bet for scaling to a large sparse case is

(A[idx[:,0]].multiply(A[idx[:,1]])).sum(axis=1)

如果这仍然产生内存错误,则可能在idx数组(或数据帧)的组上进行迭代.

If this is still producing memory errors, then iterate, possibly over groups of the idx array (or dataframe).

这篇关于特定元素的Numpy/Scipy广播计算标量积的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆