The difference in pseudo-inverse between SciPy and NumPy
Question
I found that there are two versions of the pinv() function, which calculates the pseudo-inverse of a matrix: one in SciPy and one in NumPy. The documentation can be viewed at:
http://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.pinv.html
http://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.pinv.html
The problem is that I have a 50000 x 5000 matrix. When I use scipy.linalg.pinv, it takes more than 20 GB of memory, but when I use numpy.linalg.pinv, it uses less than 1 GB.
I was wondering why numpy and scipy both have a pinv with different implementations, and why their performance is so different.
Recommended answer
I can't speak to why there are implementations in both scipy and numpy, but I can explain why the behaviour is different.
numpy.linalg.pinv approximates the Moore-Penrose pseudo-inverse using an SVD (the LAPACK routine dgesdd, to be precise), whereas scipy.linalg.pinv solves a model linear system in the least-squares sense to approximate the pseudo-inverse (using dgelss). This is why their performance differs. I would expect the overall accuracy of the resulting pseudo-inverse estimates to differ somewhat as well.
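To make the contrast concrete, here is a minimal sketch of the least-squares route next to the SVD route on a small matrix. It assumes the "model linear system" is A X = I with an m x m identity on the right-hand side, which is my reading of the approach rather than something stated above. Under that assumption, the identity alone for m = 50000 in float64 would need roughly 20 GB, which would line up with the memory use reported in the question. The matrix sizes and the helper identity_gigabytes are illustrative only.

```python
import numpy as np
from scipy.linalg import lstsq

rng = np.random.default_rng(0)
m, n = 60, 8                      # small stand-in for the 50000 x 5000 matrix
A = rng.standard_normal((m, n))

# Least-squares route (assumed formulation): solve A @ X = I for X,
# where I is the m x m identity. The gelss driver is selected explicitly.
identity = np.eye(m)
X_lstsq = lstsq(A, identity, lapack_driver="gelss")[0]   # shape (n, m)

# SVD route, as used by numpy.linalg.pinv.
X_svd = np.linalg.pinv(A)

# Largest element-wise disagreement between the two estimates.
max_abs_diff = np.max(np.abs(X_lstsq - X_svd))
print("max abs difference:", max_abs_diff)


def identity_gigabytes(rows, itemsize=8):
    """Memory needed for a dense rows x rows float64 identity, in GB (1e9 bytes)."""
    return rows * rows * itemsize / 1e9


print("identity for m = 50000:", identity_gigabytes(50_000), "GB")
```

On a well-conditioned matrix like this one the two estimates should agree to around machine precision; the difference in cost comes from the size of the right-hand side, not from the answer.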
You might find that scipy.linalg.pinv2 performs more similarly to numpy.linalg.pinv, as it too uses an SVD method rather than a least-squares approximation.
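For reference, the SVD method can be sketched by hand with numpy.linalg.svd alone, which avoids depending on scipy.linalg.pinv2 (it was deprecated in later SciPy releases and may not exist in the version you have installed). The function name pinv_via_svd and the rcond default are my own choices for illustration, not part of either library, and the sketch handles real-valued matrices only.

```python
import numpy as np


def pinv_via_svd(a, rcond=1e-15):
    """Pseudo-inverse of a real matrix from its reduced SVD (illustrative helper)."""
    # Reduced SVD: for an m x n input with m >= n, u is m x n, never m x m.
    u, s, vt = np.linalg.svd(a, full_matrices=False)

    # Invert only singular values above a relative cutoff; treat the rest as zero.
    cutoff = rcond * s.max()
    s_inv = np.zeros_like(s)
    large = s > cutoff
    s_inv[large] = 1.0 / s[large]

    # A+ = V @ diag(1/s) @ U^T; broadcasting scales the columns of V.
    return (vt.T * s_inv) @ u.T


rng = np.random.default_rng(1)
A = rng.standard_normal((60, 8))
A_pinv = pinv_via_svd(A)

print(A_pinv.shape)
print(np.allclose(A_pinv, np.linalg.pinv(A)))
```

Because the reduced SVD never forms an m x m array, the working memory stays on the order of the input matrix itself, which fits the much smaller footprint seen with the SVD-based function.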