Computing eigenvectors of a sparse matrix in R

Question

I am trying to compute the first m eigenvectors of a large sparse matrix in R. Using eigen() is not realistic, because "large" here means N > 10^6.

So far I have figured out that I should use ARPACK from the igraph package, which can deal with sparse matrices. However, I can't get it to work on a very simple (3x3) matrix:

library(Matrix)
library(igraph)

TestDiag <- Diagonal(3, 3:1)
TestMatrix <- t(sparseMatrix(i = c(1, 1, 2, 2, 3), j = c(1, 2, 1, 2, 3), x = c(3/5, 4/5, -4/5, 3/5, 1)))
TestMultipliedMatrix <- t(TestMatrix) %*% TestDiag %*% TestMatrix

I then used the code from the example in the arpack() help page to extract the first 2 eigenvectors:

func <- function(x, extra=NULL) { as.vector(TestMultipliedMatrix %*% x) } 
arpack(func, options=list(n = 3, nev = 2, ncv = 3, sym=TRUE, which="LM", maxiter=200), complex = FALSE)

I get an error message:

Error in arpack(func, options = list(n = 3, nev = 2, ncv = 3, sym = TRUE,  :
  At arpack.c:1156 : ARPACK error, NCV must be greater than NEV and less than or equal to N

I don't understand this error, as ncv (3) is greater than nev (2) here, and equal to N (3).

Am I making some stupid mistake, or is there a better way to compute eigenvectors of a sparse matrix in R?

Update

This error is apparently due to a bug in the arpack() function involving uppercase / lowercase NCV and NEV.

Any suggestions to solve the bug (I tried to have a look at the package code, but it is far too complex for me to understand) or to compute the eigenvectors in another way are welcome.

Recommended answer

There is actually no bug here. The mistake is that you put sym=TRUE into the ARPACK options list, whereas sym is an argument of the arpack() function itself. That is, the correct call is:

ev <- arpack(func, options=list(n=3, nev=2, ncv=3, which="LM", maxiter=200), 
             sym=TRUE, complex = FALSE)
ev$values
# [1] 3 2
ev$vectors
#               [,1]          [,2]
# [1,] -6.000000e-01 -8.000000e-01
# [2,]  8.000000e-01 -6.000000e-01
# [3,]  2.220446e-16 -9.714451e-17
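
As a sanity check, the result can be compared with base R's dense eigen(), which is affordable for a matrix this small. A minimal sketch, assuming the TestMultipliedMatrix construction from the question; the names dense and ref are introduced here for illustration only:

```r
library(Matrix)

# Rebuild the test matrix from the question.
TestDiag <- Diagonal(3, 3:1)
TestMatrix <- t(sparseMatrix(i = c(1, 1, 2, 2, 3), j = c(1, 2, 1, 2, 3),
                             x = c(3/5, 4/5, -4/5, 3/5, 1)))
TestMultipliedMatrix <- t(TestMatrix) %*% TestDiag %*% TestMatrix

# Dense copy: only feasible because the matrix is tiny.
dense <- as.matrix(TestMultipliedMatrix)

# eigen() returns eigenvalues in decreasing order for symmetric input.
ref <- eigen(dense, symmetric = TRUE)
ref$values[1:2]     # the two largest eigenvalues
ref$vectors[, 1:2]  # matching eigenvectors, defined only up to sign
```

Eigenvectors are only determined up to sign, so the columns may differ from the arpack() output by a factor of -1.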

If you are interested in the details: because sym was not passed to arpack() itself, the general non-symmetric eigensolver is called instead of the symmetric one, and that solver additionally requires NCV-NEV >= 2. Here NCV-NEV = 3-2 = 1, so the check fails. From the ARPACK source (dnaupd.f):

...
c          NOTE: 2 <= NCV-NEV in order that complex conjugate pairs of Ritz 
c          values are kept together. (See remark 4 below)
...
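
The two constraints can be written down as a small check. This is only a sketch of the conditions quoted in the error message and in dnaupd.f, not igraph or ARPACK code; check_ncv is a hypothetical helper name:

```r
# Hypothetical helper: TRUE if ncv satisfies the constraint quoted for the chosen solver.
check_ncv <- function(n, nev, ncv, sym) {
  if (sym) {
    # Symmetric solver: NEV < NCV <= N (the condition as worded in the error message).
    ncv > nev && ncv <= n
  } else {
    # Non-symmetric solver (dnaupd.f): 2 <= NCV - NEV and NCV <= N.
    ncv - nev >= 2 && ncv <= n
  }
}

check_ncv(n = 3, nev = 2, ncv = 3, sym = TRUE)   # TRUE
check_ncv(n = 3, nev = 2, ncv = 3, sym = FALSE)  # FALSE
```

With n = 3, nev = 2 and ncv = 3, the settings pass for the symmetric solver but not for the non-symmetric one, which matches the behaviour described above.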

Some more comments, only loosely related to your question. arpack() can be quite slow. The problem is that it has to call back to R from the C code in each iteration. See this thread: http://lists.gnu.org/archive/html/igraph-help/2012-02/msg00029.html The bottom line is that arpack() only helps if your matrix-vector product callback is fast and you don't need many iterations; the latter depends on the eigenstructure of the matrix.
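
To keep the callback cheap, it can be reduced to a single sparse matrix-vector product, with the matrix handed over through the extra argument of arpack(). A sketch under stated assumptions: the test matrix A, its size, and the name mv are made up for illustration, and no timing claims are implied:

```r
library(Matrix)
library(igraph)

# Illustrative sparse matrix with two well-separated leading eigenvalues (10 and 5).
n <- 1000
A <- sparseMatrix(i = 1:n, j = 1:n, x = c(10, 5, rep(1, n - 2)))

# Callback: one sparse matrix-vector product, returned as a plain numeric vector.
mv <- function(x, extra = NULL) as.vector(extra %*% x)

# sym is passed to arpack() itself, not inside the options list.
ev <- arpack(mv, extra = A, sym = TRUE,
             options = list(n = n, nev = 2, ncv = 6, which = "LM", maxiter = 200))
ev$values
```

Passing the matrix via extra rather than reading a global variable keeps the callback self-contained; whether this is fast enough for N > 10^6 still depends on the number of iterations ARPACK needs.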

I created an issue in the igraph issue tracker to see whether it would be possible to optionally use a C callback, via Rcpp, instead of the R callback: https://github.com/igraph/igraph/issues/491 (you can follow this issue if you are interested).
