Getting "node stack overflow" when cbind-ing multiple sparse matrices


Question

I have 100,000 sparse matrices ("dgCMatrix") stored in a list object. Every matrix has the same number of rows (8,000,000), and the list is approximately 25 GB in size. Now when I do:

do.call(cbind, theListofMatrices)

to combine all matrices into one big sparse matrix, I get "node stack overflow". In fact, I can't even do this with only 500 elements of that list, which should produce a sparse matrix of only about 100 MB.

My guess is that the cbind() function converts the sparse matrices to ordinary dense matrices and that this causes the stack overflow?

I have also tried something like this:

tmp = do.call(cbind, theListofMatrices[1:400])

This works fine, and tmp is still a sparse matrix, 95 MB in size. Then I tried:

> tmp = do.call(cbind, theListofMatrices[1:410])
Error in stopifnot(0 <= deparse.level, deparse.level <= 2) : 
  node stack overflow

and the error occurred. However, I have no trouble doing something like:

cbind(tmp, tmp, tmp, tmp)

so I believe it has something to do with do.call().

Reduce() seems to solve my problem, though I still don't know why do.call() fails.

Recommended answer

The problem is not in do.call() but in the way cbind from the Matrix package is implemented. It binds its arguments recursively: Matrix::cbind(mat1, mat2, mat3) is translated into something along the lines of Matrix::cbind(mat1, Matrix::cbind(mat2, mat3)). Since do.call(cbind, theListofMatrices) is essentially cbind(theListofMatrices[[1]], theListofMatrices[[2]], ...), a long list passes a very large number of arguments to cbind, the recursion nests too deeply, and the call fails.
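To make the nesting concrete, here is a toy model of that kind of recursion. It is only a sketch and not the Matrix package's actual code: recursive_bind is a hypothetical function, and it uses small base matrices so it runs without any packages. It binds the first matrix to the recursively bound remainder and reports how deep the calls were nested.

```r
# Hypothetical toy model (NOT the Matrix package's implementation):
# bind the first element to the recursively bound rest of the list.
recursive_bind <- function(mats, depth = 1L) {
  if (length(mats) == 1L) {
    # Base case: nothing left to bind; report how deep we got.
    return(list(result = mats[[1L]], depth = depth))
  }
  rest <- recursive_bind(mats[-1L], depth + 1L)
  list(result = cbind(mats[[1L]], rest$result), depth = rest$depth)
}

# 50 small base matrices with 2 rows and 1 column each.
mats <- replicate(50, matrix(1, nrow = 2, ncol = 1), simplify = FALSE)
out <- recursive_bind(mats)

dim(out$result)  # dimensions of the combined matrix
out$depth        # deepest level of nesting reached
```

In this model the nesting depth grows by one for every additional matrix, which is why a scheme like this can handle a few hundred arguments and still fail a little further on.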

Thus, Ben's suggestion in the comments to use Reduce() is a good way to work around the issue, since it replaces the deep recursion with an iteration in which each cbind call receives only two arguments:

tmp <- Reduce(cbind, theListofMatrices[-1], theListofMatrices[[1]])
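The Reduce() call can be tried on a small stand-in list. The sketch below assumes the Matrix package is installed and builds random test data with rsparsematrix(). It also shows a batched variant, cbind_chunked, which is a hypothetical helper and not part of the original answer: it keeps every individual cbind() call down to at most chunk_size arguments, which may help if rebuilding the growing result once per matrix turns out to be slow for a very long list.

```r
library(Matrix)

# Small stand-in for theListofMatrices: 20 sparse 5 x 2 matrices.
set.seed(1)
theListofMatrices <- replicate(
  20,
  rsparsematrix(nrow = 5, ncol = 2, density = 0.3),
  simplify = FALSE
)

# Iterative binding, as in the answer: two arguments per cbind() call.
tmp <- Reduce(cbind, theListofMatrices[-1], theListofMatrices[[1]])

# Hypothetical helper (an assumption, not from the answer): bind in
# batches so no single cbind() call gets more than chunk_size arguments.
cbind_chunked <- function(mats, chunk_size = 100L) {
  stopifnot(chunk_size >= 2L)
  while (length(mats) > 1L) {
    # Consecutive groups of at most chunk_size matrices, in order.
    groups <- split(mats, ceiling(seq_along(mats) / chunk_size))
    mats <- unname(lapply(groups, function(g) {
      if (length(g) == 1L) g[[1L]] else do.call(cbind, unname(g))
    }))
  }
  mats[[1L]]
}

tmp2 <- cbind_chunked(theListofMatrices, chunk_size = 4L)

dim(tmp)   # dimensions of the Reduce() result
dim(tmp2)  # dimensions of the batched result
```

The chunk size is a tuning choice: it has to stay below the number of arguments at which cbind starts to fail, which in the question was somewhere between 400 and 410.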
