Efficient incremental sparse matrix in python / scipy / numpy


Question

Is there a way in Python to update a sparse matrix incrementally and efficiently?

 from scipy.sparse import lil_matrix

 H = lil_matrix((n, m))
 for i, j in zip(A, B):
     H[i, j] += compute_something  # placeholder for the per-entry value

Building a sparse matrix this way seems to be quite slow (lil_matrix is the fastest sparse matrix type for that).

Is there a way (such as a dict of dicts, or some other approach) to build the sparse matrix H efficiently?

Recommended answer

In https://stackoverflow.com/a/27771335/901925 I explore incremental matrix assignment.

lil and dok are the recommended formats if you want to change values. csr will give you an efficiency warning, and coo does not allow indexing.
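As a rough illustration of the format differences described above, the following sketch assigns a single element in each format. The 3x3 shape and the variable names are made up for the example, and the exact coo behavior may differ between SciPy versions, so the failure is caught rather than relied upon.

```python
import warnings
from scipy import sparse

# lil and dok accept item assignment without complaint.
lil = sparse.lil_matrix((3, 3))
lil[0, 1] = 1.0
dok = sparse.dok_matrix((3, 3))
dok[0, 1] = 1.0

# csr accepts the assignment but warns, because changing its
# sparsity structure is expensive.
csr = sparse.csr_matrix((3, 3))
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    csr[0, 1] = 1.0
csr_warned = any(
    issubclass(w.category, sparse.SparseEfficiencyWarning) for w in caught
)

# coo is expected to reject item assignment; record what happens
# instead of assuming it.
coo = sparse.coo_matrix((3, 3))
try:
    coo[0, 1] = 1.0
    coo_assignable = True
except (TypeError, NotImplementedError):
    coo_assignable = False

print(csr_warned, coo_assignable)
```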

But I also found that dok indexing is slow compared to regular dictionary indexing. So for many changes it is better to build a plain dictionary (with the same tuple indexing), and build the dok matrix from that.
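A minimal sketch of the plain-dictionary approach, under some assumptions: the helper name `dict_to_dok` is hypothetical, the per-entry value `i + j + k` is just sample data, and the conversion goes through coo as one portable way to get a dok matrix from `(row, col)` keys. It is not necessarily how the linked answer does it.

```python
from collections import defaultdict

import numpy as np
from scipy import sparse


def dict_to_dok(entries, shape):
    """Build a dok matrix from a {(row, col): value} dict (hypothetical helper)."""
    if not entries:
        return sparse.dok_matrix(shape)
    rows, cols = zip(*entries.keys())
    data = np.fromiter(entries.values(), dtype=float, count=len(entries))
    # Go through coo, which accepts (data, (rows, cols)) directly.
    return sparse.coo_matrix((data, (rows, cols)), shape=shape).todok()


A = [0, 1, 1, 2]
B = [0, 2, 2, 1]

# Accumulate in an ordinary dict keyed by (i, j) tuples.
acc = defaultdict(float)
for k, (i, j) in enumerate(zip(A, B)):
    acc[i, j] += i + j + k  # sample value standing in for compute_something

H = dict_to_dok(acc, (4, 4))
print(H.toarray())
```

All the incremental `+=` work happens on the Python dict; the sparse matrix is created once at the end.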

But if you can calculate the H data values with a fast numpy vector operation, as opposed to iteration, it is best to do so, and construct the sparse matrix from that (e.g. coo format). In fact even with iteration this would be faster:

 h = np.zeros(A.shape)
 for k, (i,j) in enumerate(zip(A,B)):
    h[k] = compute_something 
 H = sparse.coo_matrix((h, (A, B)), shape=(n,m))

For example:

In [780]: A=np.array([0,1,1,2]); B=np.array([0,2,2,1])
In [781]: h=np.zeros(A.shape)
In [782]: for k, (i,j) in enumerate(zip(A,B)):
   .....:     h[k] = i+j+k
   .....:     
In [783]: h
Out[783]: array([ 0.,  4.,  5.,  6.])
In [784]: M=sparse.coo_matrix((h,(A,B)),shape=(4,4))
In [785]: M
Out[785]: 
<4x4 sparse matrix of type '<class 'numpy.float64'>'
    with 4 stored elements in COOrdinate format>
In [786]: M.toarray()
Out[786]: 
array([[ 0.,  0.,  0.,  0.],
       [ 0.,  0.,  9.,  0.],
       [ 0.,  6.,  0.,  0.],
       [ 0.,  0.,  0.,  0.]])

Note that the (1,2) value is the sum 4+5. Duplicate coordinates are summed when the coo matrix is converted, for example to csr or to a dense array.
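The duplicate handling can be checked directly. This sketch reuses the arrays from the session above (the names `rows`, `cols`, `vals` are chosen for the example) and compares the stored-element count before and after converting to csr.

```python
import numpy as np
from scipy import sparse

rows = np.array([0, 1, 1, 2])
cols = np.array([0, 2, 2, 1])
vals = np.array([0.0, 4.0, 5.0, 6.0])

# coo stores every (row, col, value) triple as given, duplicates included.
M = sparse.coo_matrix((vals, (rows, cols)), shape=(4, 4))

# Converting to csr merges the two (1, 2) entries by summing them.
C = M.tocsr()

print(M.nnz, C.nnz, C[1, 2])
```

This is why appending `(i, j, value)` triples and converting once can stand in for repeated `H[i, j] += value` updates.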

In this case I could have calculated h with:

In [791]: A+B+np.arange(A.shape[0])
Out[791]: array([0, 4, 5, 6])

So no iteration is needed.
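Putting the pieces together, here is a small end-to-end sketch, using the same sample arrays and an assumed 4x4 shape. It computes the values both ways, confirms they agree, and builds the matrix from the vectorized result.

```python
import numpy as np
from scipy import sparse

A = np.array([0, 1, 1, 2])
B = np.array([0, 2, 2, 1])

# Vectorized: one numpy expression for all values.
h_vec = A + B + np.arange(A.shape[0])

# Loop version, kept only to check that both give the same values.
h_loop = np.zeros(A.shape)
for k, (i, j) in enumerate(zip(A, B)):
    h_loop[k] = i + j + k

# Build once in coo, then convert to csr for later arithmetic or indexing.
H = sparse.coo_matrix((h_vec, (A, B)), shape=(4, 4)).tocsr()
print(np.array_equal(h_vec, h_loop), H[1, 2])
```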
