Efficient way of taking the logarithm of a sparse matrix
Problem description
I have a big sparse matrix. I want to take the base-4 logarithm (log4) of every element in that sparse matrix.
I tried to use numpy.log(), but it doesn't work with sparse matrices.
I can also take the logarithm row by row, and then overwrite the old row with the new one.
# Assume A is a sparse matrix (LIL, list-of-lists format) with float values as data
# This handles only one row
import numpy as np
c = np.log(A.getrow(0)) / np.log(4)
A[0, :] = c
This was not as quick as I'd expected. Is there a faster way to do this?
Recommended answer
You can modify the data attribute directly:
>>> import numpy as np
>>> from scipy.sparse import coo_matrix
>>> a = np.array([[5,0,0,0,0,0,0],[0,0,0,0,2,0,0]])
>>> coo = coo_matrix(a)
>>> coo.data
array([5, 2])
>>> coo.data = np.log(coo.data)
>>> coo.data
array([ 1.60943791,  0.69314718])
>>> coo.todense()
matrix([[ 1.60943791,  0.        ,  0.        ,  0.        ,  0.        ,
          0.        ,  0.        ],
        [ 0.        ,  0.        ,  0.        ,  0.        ,  0.69314718,
          0.        ,  0.        ]])
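The session above takes the natural logarithm, while the question asks for base 4. A minimal sketch of the same idea with a change of base, assuming a CSR matrix with strictly positive stored values (the example matrix is made up for illustration):

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical example matrix; any CSR/CSC matrix with positive stored values works.
dense = np.array([[16.0, 0.0, 0.0],
                  [0.0, 4.0, 1.0]])
A = csr_matrix(dense)

# Change of base: log4(x) = ln(x) / ln(4).
# Only the stored (nonzero) entries are touched; implicit zeros stay zero.
A.data = np.log(A.data) / np.log(4)

result = A.toarray()
```

Note that implicit zeros are left at 0 rather than becoming -inf, which is usually what is wanted for a sparse matrix but is not the mathematical logarithm of zero.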
Note that this doesn't work properly if the sparse format has repeated elements (which is valid in the COO format); it'll take the logs individually, and log(a) + log(b) != log(a + b). You probably want to convert to CSR or CSC first (which is fast) to avoid this problem.
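To illustrate the duplicate-entry pitfall, here is a small sketch with a made-up COO matrix in which position (0, 0) is stored twice. It compares taking the log directly on the COO data with converting to CSR first, which sums duplicates during conversion:

```python
import numpy as np
from scipy.sparse import coo_matrix

# Hypothetical data: entry (0, 0) appears twice, so its logical value is 1 + 3 = 4.
rows = np.array([0, 0, 1])
cols = np.array([0, 0, 2])
vals = np.array([1.0, 3.0, 4.0])
coo = coo_matrix((vals, (rows, cols)), shape=(2, 3))

# Log applied to each stored duplicate separately: yields log(1) + log(3).
wrong = coo.copy()
wrong.data = np.log(wrong.data)
wrong_value = wrong.toarray()[0, 0]

# Converting to CSR sums the duplicates first, so the log sees 4.
right = coo.tocsr()
right.data = np.log(right.data)
right_value = right.toarray()[0, 0]
```

Here wrong_value equals log(3), while right_value equals log(4), the logarithm of the actual matrix entry.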
You'll also have to add checks if the sparse matrix is in a different format, of course. And if you don't want to modify the matrix in place, just construct a new sparse matrix as you did in your answer, but without adding 3, because that's completely unnecessary here.
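A non-in-place variant with a basic format check might look like the sketch below. The helper name sparse_log and its base parameter are hypothetical, not part of SciPy; it assumes all stored values are positive and always returns a CSR matrix:

```python
import numpy as np
from scipy import sparse

def sparse_log(A, base=np.e):
    """Hypothetical helper: return a new CSR matrix holding the logarithm
    of A's stored entries, leaving A itself unchanged."""
    if not sparse.issparse(A):
        raise TypeError("expected a SciPy sparse matrix")
    # tocsr(copy=True) gives an independent CSR matrix (duplicates summed for COO input).
    out = A.tocsr(copy=True)
    out.data = np.log(out.data) / np.log(base)
    return out

# Usage with a LIL matrix, the format mentioned in the question.
A = sparse.lil_matrix((2, 3))
A[0, 0] = 16.0
A[1, 2] = 4.0
B = sparse_log(A, base=4)
```

Converting to CSR up front keeps the helper independent of the input format, at the cost of returning CSR even when the caller passed something else.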