Computing eigenvectors of a sparse matrix in R
Question
I am trying to compute the first m eigenvectors of a large sparse matrix in R. Using eigen() is not realistic because "large" means N > 10^6 here.
So far I have figured out that I should use ARPACK from the igraph package, which can deal with sparse matrices. However, I can't get it to work on a very simple (3x3) matrix:
library(Matrix)
library(igraph)
TestDiag <- Diagonal(3, 3:1)
TestMatrix <- t(sparseMatrix(i = c(1, 1, 2, 2, 3), j = c(1, 2, 1, 2, 3), x = c(3/5, 4/5, -4/5, 3/5, 1)))
TestMultipliedMatrix <- t(TestMatrix) %*% TestDiag %*% TestMatrix
I then use the code given in the example in the help page of the arpack() function to extract the first 2 eigenvectors:
func <- function(x, extra=NULL) { as.vector(TestMultipliedMatrix %*% x) }
arpack(func, options=list(n = 3, nev = 2, ncv = 3, sym=TRUE, which="LM", maxiter=200), complex = FALSE)
I get an error message:
Error in arpack(func, options = list(n = 3, nev = 2, ncv = 3, sym = TRUE, :
At arpack.c:1156 : ARPACK error, NCV must be greater than NEV and less than or equal to N
I don't understand this error, as ncv (3) is greater than nev (2) here, and equal to N (3).
Am I making some stupid mistake, or is there a better way to compute eigenvectors of a sparse matrix in R?
Update
This error is apparently due to a bug in the arpack() function involving uppercase / lowercase NCV and NEV.
Any suggestions for working around the bug (I tried to look at the package code, but it is far too complex for me to understand) or for computing the eigenvectors in another way are welcome.
Recommended answer
There is actually no bug here. The mistake is that you put sym=TRUE into the ARPACK options list, whereas sym is an argument of the arpack() function itself. The correct call is:
ev <- arpack(func, options=list(n=3, nev=2, ncv=3, which="LM", maxiter=200),
sym=TRUE, complex = FALSE)
ev$values
# [1] 3 2
ev$vectors
# [,1] [,2]
# [1,] -6.000000e-01 -8.000000e-01
# [2,] 8.000000e-01 -6.000000e-01
# [3,] 2.220446e-16 -9.714451e-17
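For a matrix this small, the ARPACK result can be cross-checked against base R's eigen() on a dense copy. The following is only a sanity-check sketch: it rebuilds the matrix from the question, and the names dense and ref are introduced here for illustration. Eigenvectors are only defined up to sign, so absolute values are compared.

```r
library(Matrix)

# Rebuild the 3x3 test matrix from the question
TestDiag <- Diagonal(3, 3:1)
TestMatrix <- t(sparseMatrix(i = c(1, 1, 2, 2, 3), j = c(1, 2, 1, 2, 3),
                             x = c(3/5, 4/5, -4/5, 3/5, 1)))
TestMultipliedMatrix <- t(TestMatrix) %*% TestDiag %*% TestMatrix

# Dense reference solution; only feasible because the matrix is tiny
dense <- as.matrix(TestMultipliedMatrix)
ref <- eigen(dense, symmetric = TRUE)

ref$values[1:2]          # the two largest eigenvalues
abs(ref$vectors[, 1:2])  # compare with abs(ev$vectors); signs may differ
```

This approach does not scale to the N > 10^6 case, where a dense copy would not fit in memory; it is only useful for validating the callback on a toy example.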
If you are interested in the details: what happens is that the general non-symmetric eigensolver is called instead of the symmetric one, and the non-symmetric solver additionally requires NCV-NEV >= 2. From the ARPACK source (dnaupd.f):
...
c NOTE: 2 <= NCV-NEV in order that complex conjugate pairs of Ritz
c values are kept together. (See remark 4 below)
...
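To make the two sets of constraints concrete, here is a small sketch of the checks as described above. The function ncv_ok is a hypothetical helper written for this illustration; it is not part of igraph or ARPACK and only mirrors the stated rules (NEV < NCV <= N in both cases, plus NCV-NEV >= 2 for the non-symmetric solver).

```r
# Hypothetical helper (not part of igraph or ARPACK): mirrors the NCV
# constraints described above for the symmetric and non-symmetric solvers.
ncv_ok <- function(n, nev, ncv, sym) {
  basic <- ncv > nev && ncv <= n             # required by both solvers
  if (sym) basic else basic && (ncv - nev >= 2)  # extra rule when sym = FALSE
}

# The call from the question ends up in the non-symmetric solver:
ncv_ok(n = 3, nev = 2, ncv = 3, sym = FALSE)  # FALSE
# With sym = TRUE passed to arpack() itself, the same sizes are fine:
ncv_ok(n = 3, nev = 2, ncv = 3, sym = TRUE)   # TRUE
```

This is why the error message looked wrong: the message states the symmetric-case rule, while the check that actually failed was the stricter non-symmetric one.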
A few more comments, only loosely related to your question. arpack() can be quite slow, because it has to call back into R from the C code in each iteration. See this thread: http://lists.gnu.org/archive/html/igraph-help/2012-02/msg00029.html
The bottom line is that arpack() only helps if your matrix-vector product callback is fast and you don't need many iterations; the latter depends on the eigenstructure of the matrix.
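One way to judge whether the callback is fast enough is to time it on its own before handing it to arpack(). The sketch below does this with a random sparse symmetric matrix from the Matrix package. The size n, the density, the repetition count, and the name matvec are all assumptions chosen for illustration, and the matrix is far smaller than the one in the question.

```r
library(Matrix)

set.seed(1)
n <- 2000L  # illustrative size only, far below the N > 10^6 of the question
A <- rsparsematrix(n, n, density = 0.001, symmetric = TRUE)

# Callback in the form arpack() expects: takes a vector and returns
# the matrix-vector product as a plain numeric vector
matvec <- function(x, extra = NULL) as.vector(A %*% x)

x <- rnorm(n)
y <- matvec(x)

# Time repeated calls to estimate the per-iteration cost of the callback
elapsed <- system.time(for (k in 1:100) matvec(x))[["elapsed"]]
elapsed / 100  # rough seconds per matrix-vector product
```

Multiplying this per-call cost by the expected number of iterations gives a rough lower bound on the total running time; the iteration count itself depends on the matrix and cannot be read off from this timing.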
I created an issue in the igraph issue tracker to see whether it would be possible to optionally use a C callback (via Rcpp) instead of the R callback: https://github.com/igraph/igraph/issues/491 You can follow this issue if you are interested.