Efficient incremental sparse matrix in python / scipy / numpy
Question
Is there a way in Python to update a sparse matrix incrementally and efficiently?
from scipy.sparse import lil_matrix

H = lil_matrix((n, m))
for i, j in zip(A, B):
    H[i, j] += compute_something  # placeholder for the per-entry value
Building a sparse matrix this way seems quite slow, even though lil_matrix is the fastest sparse matrix type for this kind of update.
Is there a way (such as a dict of dicts, or some other approach) to build the sparse matrix H efficiently?
Recommended answer
In https://stackoverflow.com/a/27771335/901925 I explore incremental matrix assignment.
lil and dok are the recommended formats if you want to change values. csr will give you an efficiency warning, and coo does not allow indexing.
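A small sketch of how the four formats react to a single element assignment. The helper name try_assign is hypothetical and introduced only for illustration; it assumes the classic scipy.sparse `*_matrix` classes.

```python
import warnings
from scipy import sparse

def try_assign(fmt, shape=(3, 3)):
    """Describe what happens when M[0, 1] = 1.0 is attempted on an empty matrix.

    `fmt` is one of "lil", "dok", "csr", "coo" (hypothetical helper).
    """
    M = getattr(sparse, fmt + "_matrix")(shape)
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            M[0, 1] = 1.0
        except TypeError:
            # coo matrices do not support item assignment
            return "no item assignment"
    # Keep only the sparse efficiency warnings, ignore anything unrelated
    slow = [w for w in caught
            if issubclass(w.category, sparse.SparseEfficiencyWarning)]
    return "SparseEfficiencyWarning" if slow else "ok"

results = {fmt: try_assign(fmt) for fmt in ("lil", "dok", "csr", "coo")}
```

Here lil and dok accept the assignment quietly, csr accepts it but warns that changing its sparsity structure is expensive, and coo rejects it.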
But I also found that dok indexing is slow compared to regular dictionary indexing. So if you are making many changes, it is better to build a plain dictionary (with the same tuple indexing) and then build the dok matrix from that.
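A minimal sketch of the plain-dictionary approach. The function name build_from_dict and its arguments are assumptions made for illustration. Because the ways of loading a dict directly into a dok matrix have varied across SciPy versions, this sketch goes through coo_matrix and then calls todok(), which is a swapped-in route rather than the one used in the linked answer.

```python
from collections import defaultdict

import numpy as np
from scipy import sparse

def build_from_dict(A, B, values, shape):
    """Accumulate values in a plain dict keyed by (i, j), then build a dok matrix."""
    acc = defaultdict(float)
    for i, j, v in zip(A, B, values):
        acc[(i, j)] += v          # ordinary dict update, no sparse indexing
    if not acc:
        return sparse.dok_matrix(shape)
    rows, cols = zip(*acc.keys())
    data = np.fromiter(acc.values(), dtype=float, count=len(acc))
    # Build coo from the accumulated triplets, then convert to dok
    return sparse.coo_matrix((data, (rows, cols)), shape=shape).todok()

M = build_from_dict([0, 1, 1, 2], [0, 2, 2, 1], [0.0, 4.0, 5.0, 6.0], (4, 4))
```

The repeated (1, 2) key is accumulated in the dictionary, so the sparse matrix is only touched once, at the end.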
But if you can calculate the H data values with a fast numpy vector operation, as opposed to iteration, it is best to do so and construct the sparse matrix from the result (e.g. in coo format). In fact, even with iteration this would be faster:
import numpy as np
from scipy import sparse

h = np.zeros(A.shape)
for k, (i, j) in enumerate(zip(A, B)):
    h[k] = compute_something  # value for entry (i, j)
H = sparse.coo_matrix((h, (A, B)), shape=(n, m))
For example:
In [780]: A=np.array([0,1,1,2]); B=np.array([0,2,2,1])
In [781]: h=np.zeros(A.shape)
In [782]: for k, (i,j) in enumerate(zip(A,B)):
   .....:     h[k] = i+j+k
   .....:
In [783]: h
Out[783]: array([ 0., 4., 5., 6.])
In [784]: M=sparse.coo_matrix((h,(A,B)),shape=(4,4))
In [785]: M
Out[785]:
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in COOrdinate format>
In [786]: M.A
Out[786]:
array([[ 0., 0., 0., 0.],
[ 0., 0., 9., 0.],
[ 0., 6., 0., 0.],
[ 0., 0., 0., 0.]])
Note that the (1,2) value is the sum 4+5. Duplicate (i,j) entries are summed together; that's part of the coo to csr conversion.
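A short sketch of the duplicate summing, using the same arrays as the session above. The variable names stored_before, stored_after and value_12 are introduced here only for illustration.

```python
import numpy as np
from scipy import sparse

A = np.array([0, 1, 1, 2])
B = np.array([0, 2, 2, 1])
h = np.array([0.0, 4.0, 5.0, 6.0])

# coo keeps every (i, j, value) triplet as given, including the repeated (1, 2)
coo = sparse.coo_matrix((h, (A, B)), shape=(4, 4))
stored_before = coo.nnz

# Converting to csr merges the two (1, 2) entries into one
csr = coo.tocsr()
stored_after = csr.nnz
value_12 = csr[1, 2]
```

The coo matrix holds four stored entries, while the csr matrix holds three, with the two (1,2) values merged into a single 9.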
In this case I could have calculated h with:
In [791]: A+B+np.arange(A.shape[0])
Out[791]: array([0, 4, 5, 6])
so no iteration is needed.
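Putting the pieces together, a minimal end-to-end sketch of the vectorized route. The function name build_vectorized is hypothetical, and the expression A + B + arange is just the toy formula from the example, standing in for whatever vector operation produces the real data values.

```python
import numpy as np
from scipy import sparse

def build_vectorized(A, B, shape):
    """Compute all data values in one numpy expression, then build a coo matrix."""
    A = np.asarray(A)
    B = np.asarray(B)
    # Toy formula from the example: value k is A[k] + B[k] + k
    h = A + B + np.arange(A.shape[0])
    return sparse.coo_matrix((h, (A, B)), shape=shape)

M = build_vectorized([0, 1, 1, 2], [0, 2, 2, 1], (4, 4))
dense = M.toarray()   # duplicate (1, 2) entries are summed here as well
```

No Python-level loop touches the sparse matrix; the only per-element work happens inside numpy.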