Efficient slicing of matrices using matrix multiplication, with Python, NumPy, SciPy
Problem description
I want to reshape a 2d scipy.sparse.csr.csr_matrix (let us call it A) to a 2d numpy.ndarray (let us call this B).
A could be
>shape(A)
(90, 10)
and B should then be
>shape(B)
(9, 10)
where every 10 rows of A would be collapsed into one new row holding, for each column, the maximum over that window. The column operator is not working on this unhashable type of a sparse matrix. How can I get this B by using matrix multiplications?
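For reference, the target B described above can be sketched with a dense reshape followed by a maximum over each window. This is not the matrix-multiplication approach the question asks for; it only pins down the expected output. The random data, the seed and the names `window` and `n_windows` are assumptions made for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical stand-in for A: a mostly-zero 90 x 10 sparse matrix.
rng = np.random.default_rng(0)
dense = rng.integers(0, 100, size=(90, 10))
dense[dense < 70] = 0
A = csr_matrix(dense)

window = 10                          # rows per window (assumed from the question)
n_windows = A.shape[0] // window     # 9 windows for 90 rows

# Densify, view as (windows, rows per window, columns),
# then take the maximum over the rows of each window.
B = A.toarray().reshape(n_windows, window, A.shape[1]).max(axis=1)

print(B.shape)
```

Densifying first is only reasonable while A fits comfortably in memory.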
Recommended answer
Using matrix multiplication you can slice efficiently by creating a "slicer" matrix with ones at the right places. The sliced matrix will have the same type as the "slicer", so you can efficiently control your output type.
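A minimal sketch of the slicer idea, assuming a compact selector with one row per selected row (a variant of the padded slicer used in the answer's code). The helper `row_slicer` and its parameters are hypothetical names introduced only for this illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix, issparse

def row_slicer(start, stop, n_rows, as_sparse=False):
    """Hypothetical helper: (stop - start) x n_rows matrix with a single 1 per row."""
    k = stop - start
    rows = np.arange(k)             # output row index
    cols = np.arange(start, stop)   # source row that each output row picks
    data = np.ones(k, dtype=int)
    S = csr_matrix((data, (rows, cols)), shape=(k, n_rows))
    return S if as_sparse else S.toarray()

# Small 9 x 9 sparse matrix holding 1..81, used only as demo data.
test = csr_matrix(np.arange(1, 82).reshape(9, 9))

dense_B = row_slicer(2, 5, 9) @ test                    # ndarray slicer gives an ndarray
sparse_B = row_slicer(2, 5, 9, as_sparse=True) @ test   # csr slicer gives a sparse result

print(type(dense_B).__name__, type(sparse_B).__name__)
```

Because the selector has exactly as many rows as are selected, no extra slicing of the product is needed afterwards.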
Below you will see some comparisons. The most efficient option for your case is to ask for the .A matrix and slice it. In these timings it was much faster than the .toarray() method. Using multiplication is the second fastest option when the "slicer" is created as an ndarray, multiplied by the csr matrix, and the result is then sliced.
OBS: using a coo sparse matrix for A resulted in slightly slower timings with the same proportions, and sol3 is not applicable because a coo matrix cannot be sliced directly. I realized later that in the multiplication it is converted to a csr automatically.
import numpy as np
from scipy.sparse import csr_matrix

# 9 x 9 test matrix
test = csr_matrix([
    [11,12,13,14,15,16,17,18,19],
    [21,22,23,24,25,26,27,28,29],
    [31,32,33,34,35,36,37,38,39],
    [41,42,43,44,45,46,47,48,49],
    [51,52,53,54,55,56,57,58,59],
    [61,62,63,64,65,66,67,68,69],
    [71,72,73,74,75,76,77,78,79],
    [81,82,83,84,85,86,87,88,89],
    [91,92,93,94,95,96,97,98,99]])

def sol1():
    # densify first (.A is shorthand for .toarray()), then slice rows 2..4
    B = test.A[2:5]
    return B

def sol2():
    # dense slicer with ones at (2,2), (3,3), (4,4); rows 0 and 1 stay empty
    slicer = np.array([[0,0,0,0,0,0,0,0,0],
                       [0,0,0,0,0,0,0,0,0],
                       [0,0,1,0,0,0,0,0,0],
                       [0,0,0,1,0,0,0,0,0],
                       [0,0,0,0,1,0,0,0,0]])
    # for a csr_matrix operand, * is matrix multiplication
    B = (slicer*test)[2:]
    return B

def sol3():
    # slice the sparse matrix first, then densify
    B = (test[2:5]).A
    return B

def sol4():
    # sparse slicer: product is sparse, densify and then slice
    slicer = csr_matrix( ((1,1,1),((2,3,4),(2,3,4))), shape=(5,9) )
    B = ((slicer*test).A)[2:] # just changing when we do the slicing
    return B

def sol5():
    # sparse slicer: slice the sparse product and then densify
    slicer = csr_matrix( ((1,1,1),((2,3,4),(2,3,4))), shape=(5,9) )
    B = ((slicer*test)[2:]).A
    return B
%timeit sol1()
# 10000 loops, best of 3: 60.4 us per loop
%timeit sol2()
# 10000 loops, best of 3: 91.4 us per loop
%timeit sol3()
# 10000 loops, best of 3: 111 us per loop
%timeit sol4()
# 1000 loops, best of 3: 310 us per loop
%timeit sol5()
# 1000 loops, best of 3: 363 us per loop
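The timings above come from an interactive session. Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. The function names and the 9 x 9 demo matrix below are assumptions for illustration, and .toarray() is used in place of .A. Absolute numbers depend on the machine and on the NumPy and SciPy versions, so only the ordering is worth comparing.

```python
import timeit

import numpy as np
from scipy.sparse import csr_matrix

# Demo data: 9 x 9 sparse matrix holding 1..81.
test = csr_matrix(np.arange(1, 82).reshape(9, 9))

def dense_then_slice():
    # same idea as sol1: densify, then slice rows 2..4
    return test.toarray()[2:5]

def slice_then_dense():
    # same idea as sol3: slice the sparse matrix, then densify
    return test[2:5].toarray()

results = {}
for fn in (dense_then_slice, slice_then_dense):
    # best of 3 repeats of 1000 calls each, reported in microseconds per call
    best = min(timeit.repeat(fn, number=1000, repeat=3))
    results[fn.__name__] = best / 1000 * 1e6
    print(f"{fn.__name__}: {results[fn.__name__]:.1f} us per call")
```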
The answer has been updated, replacing .toarray() with .A, which gave much faster results, and the best solutions are now placed at the top.