Division of a sparse matrix

Question

I have a scipy.sparse matrix of shape 45671x45671. Some of its rows contain only zeros.

My question is how to divide the values in each row by that row's sum. A for loop obviously works, but I am looking for an efficient method.

I have already tried:

  • matrix / matrix.sum(1), but I get a MemoryError.
  • matrix / scs.csc_matrix((matrix.sum(axis=1))), but I get ValueError: inconsistent shapes.
  • Other wacky things...

Moreover, I want to skip rows that contain only zeros.

So, if you have any solution...

Thanks in advance!

Recommended answer

I have an M hanging around:

In [241]: M
Out[241]: 
<6x3 sparse matrix of type '<class 'numpy.uint8'>'
    with 6 stored elements in Compressed Sparse Row format>
In [242]: M.A
Out[242]: 
array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1],
       [0, 1, 0],
       [0, 0, 1],
       [1, 0, 0]], dtype=uint8)
In [243]: M.sum(1)            # dense matrix
Out[243]: 
matrix([[1],
        [1],
        [1],
        [1],
        [1],
        [1]], dtype=uint32)
In [244]: M/M.sum(1)      # dense matrix - full size of M
Out[244]: 
matrix([[ 1.,  0.,  0.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.],
        [ 0.,  1.,  0.],
        [ 0.,  0.,  1.],
        [ 1.,  0.,  0.]])

That explains the memory error: M / M.sum(1) produces a dense matrix the full size of M, so it fails whenever M is so large that M.A would produce a memory error.
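For a sense of scale, here is a back-of-the-envelope estimate of that dense result for the matrix in the question. It assumes 8-byte float64 entries, which is what the small example above produces, and the variable names are only illustrative:

```python
import numpy as np

n = 45671  # side length of the matrix in the question

# Assumption: the dense result stores one float64 (8 bytes) per element.
dense_bytes = n * n * np.dtype(np.float64).itemsize

print(f"{dense_bytes / 1e9:.1f} GB")  # about 16.7 GB
```

That is well beyond the RAM of many machines, so a MemoryError is not surprising.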

In [262]: S = sparse.csr_matrix(M.sum(1))
In [263]: S.shape
Out[263]: (6, 1)
In [264]: M.shape
Out[264]: (6, 3)
In [265]: M/S
....
ValueError: inconsistent shapes

I'm not entirely sure what is going on here.

Element-wise multiplication works:

In [266]: M.multiply(S)
Out[266]: 
<6x3 sparse matrix of type '<class 'numpy.uint32'>'
    with 6 stored elements in Compressed Sparse Row format>

So it should work if I construct S as S = sparse.csr_matrix(1/M.sum(1))
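As a minimal sketch of that idea, the snippet below uses a small made-up matrix (not the M from the session above) in which every row has a nonzero sum, so the reciprocal is safe. The names row_sums and normalized are illustrative:

```python
import numpy as np
from scipy import sparse

# Hypothetical example matrix; every row has a nonzero sum.
M = sparse.csr_matrix(np.array([[1, 1, 0],
                                [0, 2, 2],
                                [4, 0, 0]], dtype=np.uint8))

row_sums = np.asarray(M.sum(axis=1))    # dense column, shape (3, 1)
S = sparse.csr_matrix(1.0 / row_sums)   # sparse column of reciprocal row sums
normalized = M.multiply(S)              # element-wise product, result is sparse

print(normalized.toarray())
```

Each row of the result sums to 1, and no dense matrix the size of M is ever created.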

If some of the rows sum to zero, you have a division-by-zero problem.

If I modify M to have a zero row:

In [283]: M.A
Out[283]: 
array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 0],
       [0, 1, 0],
       [0, 0, 1],
       [1, 0, 0]], dtype=uint8)
In [284]: S = sparse.csr_matrix(1/M.sum(1))
/usr/local/bin/ipython3:1: RuntimeWarning: divide by zero encountered in true_divide
  #!/usr/bin/python3
In [285]: S.A
Out[285]: 
array([[  1.],
       [  1.],
       [ inf],
       [  1.],
       [  1.],
       [  1.]])
In [286]: M.multiply(S)
Out[286]: 
<6x3 sparse matrix of type '<class 'numpy.float64'>'
    with 5 stored elements in Compressed Sparse Row format>
In [287]: _.A
Out[287]: 
array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.],
       [ 1.,  0.,  0.]])

This isn't the best M to demonstrate this on, but it suggests a useful approach. The row sum will be dense, so you can clean up its inverse using the usual dense array approaches.
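One possible way to put this together is sketched below. It is not the answer's own code: the helper name normalize_rows is hypothetical, and it swaps M.multiply(S) for a product with a sparse diagonal matrix built by sparse.diags, which scales the rows in the same way. Rows whose sum is zero get a scale factor of 1, so they are left as they are and no division by zero occurs.

```python
import numpy as np
from scipy import sparse

def normalize_rows(M):
    """Divide each row of a sparse matrix by its row sum.

    Rows whose sum is zero are skipped (scaled by 1), which avoids
    the divide-by-zero warning and the inf values.
    """
    row_sums = np.asarray(M.sum(axis=1)).ravel().astype(np.float64)
    scale = np.ones_like(row_sums)            # default: leave the row unchanged
    nonzero = row_sums != 0
    scale[nonzero] = 1.0 / row_sums[nonzero]  # reciprocal only where it is safe
    # Left-multiplying by a diagonal matrix scales row i by scale[i].
    return sparse.diags(scale, format="csr") @ M

# Hypothetical example with an all-zero middle row.
M = sparse.csr_matrix(np.array([[1, 1, 0],
                                [0, 0, 0],
                                [2, 0, 2]]))
result = normalize_rows(M)
print(result.toarray())  # rows: [0.5, 0.5, 0], [0, 0, 0], [0.5, 0, 0.5]
```

Only the vector of row sums is ever dense, so the memory cost stays proportional to the number of rows plus the number of stored elements.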
