Large Matrices in R: long vectors not supported yet


Question

I am running 64-bit R 3.1 in a 64-bit Ubuntu environment with 400 GB of RAM, and I am encountering a strange limitation when dealing with large matrices.

I have a numeric matrix called A that is 4000 rows by 950,000 columns. When I try to access any element in it, I receive the following error:

Error: long vectors not supported yet: subset.c:733

Although my matrix was read in via scan, you can reproduce the problem with the following code:

test <- matrix(1,4000,900000) #no error
test[1,1] #error

My Googling reveals this was a common error message prior to R 3.0, when vectors were limited to 2^31-1 elements. However, that limit should not apply in my environment.

Should I not be using the native matrix type for this kind of matrix?

Recommended answer

A matrix is just an atomic vector with a dimension attribute, which allows R to access it as a matrix. Your test matrix is a vector of length 4000*900000, which is 3.6e+9 elements (the largest integer value is approx. 2.147e+9), so it is a long vector. Subsetting a long vector, i.e. accessing elements beyond the 2.147e+9 limit, is supported for atomic vectors. Just treat your matrix as a long vector.
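To make the "vector plus dimension attribute" point concrete, here is a minimal sketch. It uses a small stand-in matrix (an assumption for illustration, since the 4000 x 900000 matrix needs roughly 29 GB) and then checks the element count of the large matrix against the integer limit using double arithmetic. The names m, v and n_elements are illustrative only.

```r
# A small matrix stands in for the 4000 x 900000 one.
m <- matrix(1:6, nrow = 2, ncol = 3)
attributes(m)          # the only attribute is dim
v <- as.vector(m)      # dropping dim leaves the underlying atomic vector
identical(v, 1:6)      # TRUE

# Element count of the large matrix, computed in doubles to avoid overflow
n_elements <- 4000 * 900000
n_elements > .Machine$integer.max   # TRUE, so it is a long vector
```

Because the element count is computed with double literals, it does not hit the integer overflow that the same product would cause with integer operands.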

If we remember that R stores matrices column-wise by default, then the element at test[ 2701 , 850000 ] sits at linear position (column - 1) * nrow + row, and we can access it via:

i <- ( 850000 - 1 ) * 4000 + 2701
test[i]
#[1] 1

Note that this really is long vector subsetting, because the same index arithmetic overflows in integers:

849999L * 4000L
#[1] NA
#Warning message:
#In 849999L * 4000L : NAs produced by integer overflow
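The column-major index arithmetic can be wrapped in a small helper. This is only a sketch: linear_index is a hypothetical name, not a base R function, and it is checked here against ordinary matrix indexing on a small matrix rather than on the full-size one.

```r
# Hypothetical helper: linear (column-major) index of element [row, col]
# in a matrix with `nrow` rows. as.numeric() forces double arithmetic so
# the result cannot overflow the integer range.
linear_index <- function(row, col, nrow) {
  (as.numeric(col) - 1) * as.numeric(nrow) + as.numeric(row)
}

# Check against ordinary matrix indexing on a small matrix
small <- matrix(seq_len(12), nrow = 3, ncol = 4)
small[linear_index(2, 3, nrow(small))] == small[2, 3]   # TRUE

# Index for test[2701, 850000] in the 4000-row matrix
i <- linear_index(2701L, 850000L, 4000L)
i > .Machine$integer.max   # TRUE, so test[i] is a long vector subset
```

Converting to double inside the helper means callers can pass integer arguments such as 850000L without triggering the overflow warning.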
