How to get a big sparse matrix in R? (> 2^31-1)

Question

I use some C++ code to take a text file from a database and create a dgCMatrix sparse matrix from the Matrix package. For the first time, I'm trying to build a matrix that has more than 2^31-1 nonzero entries, which means that the index vector in the sparse matrix object must also be longer than that limit. Unfortunately, vectors seem to use 32-bit integer indices, as does NumericVector in Rcpp.

Short of writing an entirely new data type from the ground up, does R provide any facility for this? I don't think I can use too exotic a solution, as I need glmnet to recognize the resulting object.
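
For reference, a dgCMatrix keeps its row indices (slot i) and column pointers (slot p) in R integer vectors, which are 32-bit. The following is a minimal C++ sketch, using only the standard library, of the range check that such a matrix fails once it holds more than 2^31-1 nonzeros. The helper names fits_int_index and excess_over_int are hypothetical and are not part of Rcpp or Matrix.

```cpp
#include <cstdint>
#include <limits>

// Largest value a 32-bit signed int can hold: 2^31 - 1 = 2147483647.
constexpr std::int64_t kInt32Max = std::numeric_limits<std::int32_t>::max();

// Hypothetical helper: true if a nonzero count can be stored in a
// 32-bit int, as an int-typed index or column pointer would require.
bool fits_int_index(std::int64_t nnz) {
    return nnz >= 0 && nnz <= kInt32Max;
}

// Hypothetical helper: how many nonzeros lie beyond the 32-bit limit
// (0 when the count still fits).
std::int64_t excess_over_int(std::int64_t nnz) {
    return nnz > kInt32Max ? nnz - kInt32Max : 0;
}
```

Calling fits_int_index on the expected nonzero count before filling the matrix gives an early, explicit failure instead of a silent integer overflow while the index vectors are being built.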

Recommended answer

In recent versions of R, vectors are indexed by the R_xlen_t type, which is 64 bits wide on 64-bit platforms and just int on 32-bit platforms.
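
As a rough illustration of what indexing with R_xlen_t looks like in C++ code, here is a sketch that uses std::ptrdiff_t as a stand-in for R_xlen_t (an assumption that holds for 64-bit builds of R; the sketch does not include R's headers). The alias xlen_t and the functions sum_long and index_bits are hypothetical names.

```cpp
#include <climits>
#include <cstddef>

// Stand-in for R_xlen_t (assumption: a 64-bit build, where R defines
// R_xlen_t as ptrdiff_t; on 32-bit builds it is plain int).
using xlen_t = std::ptrdiff_t;

// Sum a buffer using a loop index of type xlen_t rather than int, so
// the loop remains valid for lengths above 2^31 - 1.
double sum_long(const double* x, xlen_t n) {
    double total = 0.0;
    for (xlen_t i = 0; i < n; ++i) {
        total += x[i];
    }
    return total;
}

// Width of the index type in bits on the current platform.
int index_bits() {
    return static_cast<int>(sizeof(xlen_t) * CHAR_BIT);
}
```

The point of the sketch is only the type of the loop index and of the length parameter: code that declares either as int cannot address elements past 2^31-1 even when the underlying vector is longer.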

Rcpp so far still uses int everywhere. I would encourage you to request the feature on their issue list. It is not hard, but it needs the systematic involvement of someone with the skills, time and willingness. The development version of Rcpp11 uses the correct type; perhaps they can use that as a model.

Note, however, that even though R uses 64-bit integers on 64-bit platforms (R_xlen_t is a signed type), you are in fact limited to the range that the double type can represent exactly, because a double is what R gives you when you ask for the length of a long vector. R has no 64-bit integer type that it can represent natively, so when you ask for the length of a vector you get either an int or a double, depending on the value.
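
The limit imposed by double can be sketched in plain C++. A double has a 53-bit significand, so every integer up to 2^53 is exact, while 2^53 + 1 is not. The code below models the behaviour described above under those assumptions; it is not R's implementation, and the function names are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// 2^53: every integer of magnitude up to this value is exact in a double.
constexpr std::int64_t kMaxExactDouble = 9007199254740992LL;

// True if n can be stored in a double without rounding. Trailing zero
// bits are absorbed by the exponent, so only the remaining odd part
// has to fit into the 53-bit significand.
bool exact_in_double(std::int64_t n) {
    if (n == 0) return true;
    std::uint64_t m = n < 0 ? 0 - static_cast<std::uint64_t>(n)
                            : static_cast<std::uint64_t>(n);
    while ((m & 1u) == 0) m >>= 1;
    return m < (std::uint64_t{1} << 53);
}

// Models the rule stated above: a length that fits in a 32-bit int is
// reported as an int, anything larger as a double.
bool length_reported_as_double(std::int64_t n) {
    return n > std::numeric_limits<std::int32_t>::max();
}
```

In practice this ceiling is far above any vector that fits in memory today, so the 32-bit int limit is the one that matters for the question.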