Why is vector dot product slower with scipy's sparse csr_matrix than numpy's dense array?

Question

I have a situation in which I need to extract a single row from a sparse matrix and take its dot product with a dense row. Using scipy's csr_matrix, this appears to be significantly slower than using numpy's dense array multiplication. This is surprising to me because I expected that sparse dot product would involve significantly fewer operations. Here is an example:

>>> import timeit as ti
>>> sparse_setup = 'import numpy as np; import scipy.sparse as si;' + \
...                'u = si.eye(10000).tocsr()[10];' + \
...                'v = np.random.randint(100, size=10000)'
>>> dense_setup  = 'import numpy as np; u = np.eye(10000)[10];' + \
...                'v = np.random.randint(100, size=10000)'
>>> ti.timeit('u.dot(v)', setup=sparse_setup, number=100000)
2.788649031019304
>>> ti.timeit('u.dot(v)', setup=dense_setup, number=100000)
2.179030169005273

For matrix-vector multiplication, the sparse representation wins hands down, but not in this case. I tried with csc_matrix, but performance is even worse:

>>> sparse_setup = 'import numpy as np; import scipy.sparse as si;' + \
...                'u = si.eye(10000).tocsc()[10];' + \
...                'v = np.random.randint(100, size=10000)'
>>> ti.timeit('u.dot(v)', setup=sparse_setup, number=100000)
7.0045155879925005

Why does numpy beat scipy.sparse in this case? Is there a matrix format that is faster for this kind of computation?

Recommended answer

The CSR/CSC vector product has an overhead of a few microseconds per call, which comes from executing a small amount of Python code and from handling arguments in the compiled code (scipy.sparse._sparsetools.csr_matvec).

On modern processors, computing a vector dot product is very fast, so the overhead dominates the computing time in this case. A matrix-vector product is itself more expensive, so a similar overhead is not noticeable there.
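One way to check the fixed-overhead explanation on your own machine is to time the same call at two very different vector lengths. The sketch below is an illustration and not part of the original answer: the helper `per_call_us` and the chosen sizes are assumptions, and the dense row is built with `np.zeros` rather than `np.eye` only to avoid allocating a full dense matrix. If per-call overhead dominates, the sparse timing should change little between the two sizes.

```python
import timeit

import numpy as np
import scipy.sparse as sp


def per_call_us(func, number=2000):
    """Average wall-clock time of one call to func, in microseconds (hypothetical helper)."""
    return timeit.timeit(func, number=number) / number * 1e6


rng = np.random.default_rng(0)
timings = {}
for n in (100, 10_000):
    v = rng.integers(100, size=n)
    u_sparse = sp.eye(n, format="csr")[10]  # 1 x n CSR matrix with one stored entry
    u_dense = np.zeros(n)                   # dense row with the same single nonzero
    u_dense[10] = 1.0
    timings[n] = (
        per_call_us(lambda: u_sparse.dot(v)),
        per_call_us(lambda: u_dense.dot(v)),
    )

for n, (t_sparse, t_dense) in timings.items():
    print(f"n={n:>6}: sparse {t_sparse:.1f} us/call, dense {t_dense:.1f} us/call")
```

The absolute numbers depend on the hardware and on the NumPy and SciPy versions, so treat them as a rough indication only.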

Why, then, are the overheads smaller for NumPy? This is mainly due to better optimization of the code; the performance of csr_matrix can likely be improved here.
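If the per-call overhead of csr_matrix.dot is the bottleneck, one possible workaround (a sketch that the original answer does not propose) is to read the row's stored values and column positions directly from the `data` and `indices` attributes of the CSR row and let NumPy do the arithmetic. The function name `sparse_row_dot` is hypothetical, and whether this is actually faster should be measured for your own data.

```python
import numpy as np
import scipy.sparse as sp


def sparse_row_dot(row, v):
    """Dot product of a 1 x n CSR row with a dense 1-D vector v.

    Uses the stored values (row.data) and their column positions
    (row.indices) directly, instead of calling csr_matrix.dot.
    Assumes row has exactly one row; hypothetical helper.
    """
    return row.data.dot(v[row.indices])


n = 10_000
rng = np.random.default_rng(0)
v = rng.integers(100, size=n)
u = sp.eye(n, format="csr")[10]  # same single-row matrix as in the question

fast = sparse_row_dot(u, v)
reference = u.dot(v)[0]          # csr_matrix.dot returns a length-1 array here
assert fast == reference
```

Note that this returns a scalar, whereas `u.dot(v)` returns a one-element array, so calling code may need a small adjustment.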
