numpy functions that can/cannot be used on a compressed sparse row (CSR) matrix


Question

I am a newbie in Python and I have a (probably very naive) question. I have a CSR (compressed sparse row) matrix to work on (let's name it M), and it looks like some functions that are designed for 2d numpy array manipulation work on my matrix while others do not.

For example, numpy.sum(M, axis=0) works fine, while numpy.diagonal(M) fails with ValueError: diag requires an array of at least two dimensions.

So is there a rationale behind why one matrix function works on M while the other does not?

And a bonus question: given that numpy.diagonal does not work on it, how do I get the diagonal elements from a CSR matrix?

Recommended answer

The code for np.diagonal is:

return asanyarray(a).diagonal(offset=offset, axis1=axis1, axis2=axis2)

That is, it first tries to turn the argument into an array, which is what you want if the argument is, for example, a list of lists. But that isn't the right way to turn a sparse matrix into an ndarray.

In [33]: from scipy import sparse                                               
In [34]: M = sparse.csr_matrix(np.eye(3))                                       
In [35]: M                                                                      
Out[35]: 
<3x3 sparse matrix of type '<class 'numpy.float64'>'
    with 3 stored elements in Compressed Sparse Row format>
In [36]: M.A                                  # right                                  
Out[36]: 
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
In [37]: np.asanyarray(M)                    # wrong                           
Out[37]: 
array(<3x3 sparse matrix of type '<class 'numpy.float64'>'
    with 3 stored elements in Compressed Sparse Row format>, dtype=object)
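To tie this back to the error in the question, here is a minimal sketch (assuming a reasonably recent numpy and scipy, and using M.toarray(), the method form of M.A). It shows that np.asanyarray wraps the whole sparse matrix in a 0-d object array, which is why diagonal complains about the number of dimensions. The variable names are only illustrative.

```python
import numpy as np
from scipy import sparse

M = sparse.csr_matrix(np.eye(3))

# numpy does not know how to densify M, so it wraps the object itself
# in an array with no dimensions and dtype=object.
wrapped = np.asanyarray(M)
print(wrapped.ndim, wrapped.shape, wrapped.dtype)

# The sparse matrix's own conversion method produces a real 2-d ndarray.
dense = M.toarray()
print(dense.ndim, dense.shape)

# np.diagonal goes through asanyarray, so it ends up calling
# .diagonal() on the 0-d wrapper and fails.
error_message = None
try:
    np.diagonal(M)
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```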

The right way to use np.diagonal is:

In [38]: np.diagonal(M.A)                                                       
Out[38]: array([1., 1., 1.])

But there is no need for that. M already has a diagonal method:

In [39]: M.diagonal()                                                           
Out[39]: array([1., 1., 1.])
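As a slightly less trivial illustration than the identity matrix, the sketch below builds a small CSR matrix from made-up values and reads its diagonals directly, without densifying. It assumes a scipy version whose diagonal method accepts the k offset argument.

```python
import numpy as np
from scipy import sparse

# Example values chosen only for illustration.
dense = np.array([[1, 2, 0],
                  [0, 3, 4],
                  [5, 0, 6]])
M = sparse.csr_matrix(dense)

main_diag = M.diagonal()      # main diagonal, returned as a 1-d ndarray
upper_diag = M.diagonal(k=1)  # first diagonal above the main one

print(main_diag)
print(upper_diag)
```

The results match what np.diagonal(dense) and np.diagonal(dense, offset=1) return for the dense equivalent.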

np.sum does work, because it delegates the action to the object's own sum method (look at its code):

In [40]: M.sum(axis=0)                                                          
Out[40]: matrix([[1., 1., 1.]])
In [41]: np.sum(M, axis=0)                                                      
Out[41]: matrix([[1., 1., 1.]])
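The delegation can be made visible with a toy object. In the sketch below, HasSum is a hypothetical class invented for this example (it is not part of numpy or scipy); the point is only that np.sum, when given something that is not an ndarray, looks for a sum method and calls it. This relies on how current numpy versions implement np.sum, so treat it as an observation rather than a documented guarantee.

```python
import numpy as np
from scipy import sparse


class HasSum:
    """Hypothetical toy class with its own sum method."""

    def sum(self, axis=None, out=None):
        # Return a marker so we can see that this method was called.
        return f"HasSum.sum(axis={axis})"


# np.sum hands the work to HasSum.sum instead of converting to an array.
delegated = np.sum(HasSum(), axis=0)
print(delegated)

# The same mechanism is why np.sum works on a CSR matrix:
# it ends up calling M.sum.
M = sparse.csr_matrix(np.eye(3))
via_numpy = np.sum(M, axis=0)
via_method = M.sum(axis=0)
print(np.array_equal(via_numpy, via_method))
```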

As a general rule, try to use sparse functions and methods on sparse matrices. Don't count on numpy functions to work right. sparse is built on numpy, but numpy does not 'know' about sparse.
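A short sketch of what that rule looks like in practice, with made-up data: the sparse-aware calls keep the result sparse, and when a numpy-only function is needed, the matrix is converted explicitly first. The helper as_dense is hypothetical, written just for this example.

```python
import numpy as np
from scipy import sparse

M = sparse.csr_matrix(np.array([[1.0, 0.0, 2.0],
                                [0.0, 3.0, 0.0]]))

# Sparse-aware equivalents of common numpy operations.
stacked = sparse.vstack([M, M])  # use this rather than np.vstack
squared = M.multiply(M)          # elementwise product, stays sparse
product = M @ M.T                # matrix product, stays sparse


def as_dense(x):
    """Hypothetical helper: densify sparse input, pass other input to numpy."""
    return x.toarray() if sparse.issparse(x) else np.asarray(x)


# For a numpy-only function, convert explicitly and then call it.
col_means = np.mean(as_dense(M), axis=0)

print(stacked.shape)
print(squared.toarray())
print(product.toarray())
print(col_means)
```

Converting with toarray() allocates the full dense array, so it is only reasonable when the matrix is small enough to fit in memory.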
