Efficient slicing of matrices using matrix multiplication, with Python, NumPy, SciPy


Question

I want to reshape a 2D scipy.sparse.csr.csr_matrix (call it A) into a 2D numpy.ndarray (call it B).

A could be

>shape(A)
(90, 10)

and B should then be

>shape(B)
(9,10)

where every 10 rows of A are collapsed into a single row of B that holds, for each column, the maximum over that window. The column operator does not work on this unhashable sparse matrix type. How can I get B using matrix multiplications?
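To make the target concrete, here is a minimal sketch of the windowed maximum the question describes. It does not use matrix multiplication: a maximum is not a linear operation, so a single matrix product cannot compute it. The sketch densifies A and reshapes instead. The helper name `block_max` and the sample data are assumptions made for illustration, not part of the original post.

```python
import numpy as np
from scipy.sparse import csr_matrix


def block_max(A, window):
    """Column-wise maximum over consecutive blocks of `window` rows.

    Hypothetical helper: converts the sparse matrix to a dense ndarray,
    so it only suits matrices that fit in memory when dense.
    """
    n_rows, n_cols = A.shape
    if n_rows % window != 0:
        raise ValueError("number of rows must be a multiple of the window size")
    dense = A.toarray()
    # View the rows as (blocks, window, columns) and reduce over the window axis.
    return dense.reshape(n_rows // window, window, n_cols).max(axis=1)


# Sample data (assumed): a 90x10 matrix whose row r holds 10*r .. 10*r + 9.
A = csr_matrix(np.arange(900).reshape(90, 10))
B = block_max(A, 10)
print(B.shape)  # (9, 10)
```

Each row of B is the per-column maximum of one 10-row window of A, which matches the (90, 10) to (9, 10) shapes in the question.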

Recommended answer

With matrix multiplication you can slice efficiently by building a "slicer" matrix that has ones in the right places. The sliced result has the same type as the "slicer", so the slicer also gives you an efficient way to control the output type.
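The idea can be sketched as follows: a slicer with a single 1 per row, placed in the column of the source row to keep, selects those rows when multiplied from the left. The helper name `make_slicer` and the small sample matrix are assumptions for illustration. Unlike the 5-row slicer in the answer's code, this variant has exactly one row per selected row, so no trailing slice is needed.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse


def make_slicer(rows, n_rows):
    """Build a sparse selection matrix (hypothetical helper).

    Row i of the slicer has a single 1 in column rows[i], so
    slicer @ A returns the rows of A listed in `rows`.
    """
    rows = np.asarray(rows)
    data = np.ones(len(rows), dtype=np.int64)
    out_index = np.arange(len(rows))
    return csr_matrix((data, (out_index, rows)), shape=(len(rows), n_rows))


# Sample data (assumed): a 9x3 sparse matrix with entries 1..27.
A = csr_matrix(np.arange(1, 28).reshape(9, 3))
S = make_slicer([2, 3, 4], A.shape[0])

sparse_picked = S @ A            # sparse slicer -> sparse result
dense_picked = S.toarray() @ A   # dense slicer  -> ndarray result

print(issparse(sparse_picked), type(dense_picked).__name__)
```

Both products hold the same values as `A[2:5]`; only the container type differs, which is the type control the answer refers to.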

Below are some comparisons. The most efficient option for your case is to ask for the .A matrix (the dense array) and slice it; in these timings it was much faster than the .toarray() method. Multiplication is the second fastest option, when the "slicer" is created as an ndarray, multiplied by the csr matrix, and the result is then sliced.

Note: using a coo sparse matrix for A gave slightly slower timings with the same proportions, and sol3 is not applicable in that case. I realized later that the coo matrix is converted to csr automatically during the multiplication.

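import numpy as np
from scipy.sparse import csr_matrix

# 9x9 test matrix stored in CSR format
test = csr_matrix([
[11,12,13,14,15,16,17,18,19],
[21,22,23,24,25,26,27,28,29],
[31,32,33,34,35,36,37,38,39],
[41,42,43,44,45,46,47,48,49],
[51,52,53,54,55,56,57,58,59],
[61,62,63,64,65,66,67,68,69],
[71,72,73,74,75,76,77,78,79],
[81,82,83,84,85,86,88,88,89],
[91,92,93,94,95,96,99,98,99]])

def sol1():
    # convert to a dense ndarray with .A, then slice rows 2..4
    B = test.A[2:5]
    return B

def sol2():
    # dense "slicer": the ones at (2,2), (3,3), (4,4) copy rows 2..4 of test
    slicer = np.array([[0,0,0,0,0,0,0,0,0],
                       [0,0,0,0,0,0,0,0,0],
                       [0,0,1,0,0,0,0,0,0],
                       [0,0,0,1,0,0,0,0,0],
                       [0,0,0,0,1,0,0,0,0]])
    # for a csr_matrix, * is matrix multiplication; drop the two empty rows
    B = (slicer*test)[2:]
    return B

def sol3():
    # slice the sparse matrix first, then convert to dense
    B = (test[2:5]).A
    return B

def sol4():
    # sparse "slicer" with the same three ones as in sol2
    slicer = csr_matrix( ((1,1,1),((2,3,4),(2,3,4))), shape=(5,9) )
    B = ((slicer*test).A)[2:] # just changing when we do the slicing
    return B

def sol5():
    # as sol4, but slice the sparse product before converting to dense
    slicer = csr_matrix( ((1,1,1),((2,3,4),(2,3,4))), shape=(5,9) )
    B = ((slicer*test)[2:]).A
    return B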


timeit sol1()
#10000 loops, best of 3: 60.4 us per loop

timeit sol2()
#10000 loops, best of 3: 91.4 us per loop

timeit sol3()
#10000 loops, best of 3: 111 us per loop

timeit sol4()
#1000 loops, best of 3: 310 us per loop

timeit sol5()
#1000 loops, best of 3: 363 us per loop

Update: the answer now uses .A instead of .toarray(), which gave much faster results, and the solutions are now listed from fastest to slowest.
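The timings above use IPython's timeit magic. Outside IPython, a similar comparison can be sketched with the standard library `timeit` module. The matrix, the three candidates, and the loop counts below are assumptions chosen for illustration; absolute numbers depend on the machine and the SciPy version, so none are quoted here.

```python
import timeit

import numpy as np
from scipy.sparse import csr_matrix

# Sample data (assumed): a 9x9 sparse matrix with entries 1..81.
A = csr_matrix(np.arange(1, 82).reshape(9, 9))

# Dense 3x9 slicer with ones that pick rows 2, 3 and 4 of A.
slicer = np.zeros((3, 9), dtype=A.dtype)
slicer[[0, 1, 2], [2, 3, 4]] = 1

candidates = {
    "densify, then slice": lambda: A.toarray()[2:5],
    "slice, then densify": lambda: A[2:5].toarray(),
    "dense slicer product": lambda: slicer @ A,
}

# Run each candidate once to collect its result for a correctness check.
results = {name: fn() for name, fn in candidates.items()}

loops = 1000
for name, fn in candidates.items():
    # Best of 3 repeats, reported per loop in microseconds.
    best = min(timeit.repeat(fn, number=loops, repeat=3))
    print(f"{name}: {best / loops * 1e6:.1f} us per loop")
```

Checking that all candidates return the same rows before comparing their speed avoids timing variants that silently compute different things.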
