Maximum size of a matrix in R

Question

I am using igraph to do some network analysis. As part of that, I have to create a matrix with 2 columns and as many rows as there are links. I have a large network (several million links), and creating this matrix had not finished after 3 hours of run time (no errors, just no result, and R shows "not responding").

What is the maximum size of such a character matrix? How long should it take to create?

I am running 64-bit R 2.14.1 on a Windows 7 machine with 4 GB of memory and a 2.67 GHz processor.

Thanks.

ADDED: Thanks for the quick responses. They convinced me that the size of the matrix was not the problem; it turned out I was using the wrong columns of another matrix to create it.

Recommended answer

The theoretical limit of a vector in R is 2147483647 (2^31 - 1) elements. A matrix is stored as a single vector, so with 2 columns that is about 1 billion rows.
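As a quick check of that arithmetic, the sketch below reads the limit from `.Machine$integer.max` and derives the row count for a 2-column matrix. It assumes the vector-length limit stated above; the variable names are only illustrative.

```r
# Largest value of R's 32-bit signed integer type; this is the vector-length
# cap the answer refers to.
max_elements <- .Machine$integer.max   # 2^31 - 1

n_cols <- 2
# A matrix is a vector with a dim attribute, so rows * cols must fit the cap.
max_rows <- max_elements %/% n_cols

max_rows
```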

...but that much data does not fit in 4 GB of memory, especially not as strings in a character vector. Each string takes at least 96 bytes (object.size('a') == 96), and each element of your matrix is an 8-byte pointer to such a string (though only one instance of each unique string is stored).
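To make those numbers concrete, here is a rough back-of-the-envelope estimate. It is a sketch under stated assumptions, not a measurement: the function `estimate_char_matrix_bytes` is hypothetical, it uses the 8-byte pointer and 96-byte string figures quoted above, and the example network size (5 million links, 1 million distinct node names) is invented for illustration.

```r
# Rough lower bound on the memory a character matrix needs on 64-bit R.
# Assumptions (illustrative): 8 bytes per pointer, 96 bytes per unique string.
# Real strings are often larger, so treat the result as a floor.
estimate_char_matrix_bytes <- function(n_rows, n_cols, n_unique,
                                       bytes_per_string = 96,
                                       bytes_per_pointer = 8) {
  pointer_bytes <- n_rows * n_cols * bytes_per_pointer  # one pointer per cell
  string_bytes  <- n_unique * bytes_per_string          # each unique string once
  pointer_bytes + string_bytes
}

# Hypothetical network: 5 million links between 1 million distinct node names.
bytes <- estimate_char_matrix_bytes(n_rows = 5e6, n_cols = 2, n_unique = 1e6)
bytes / 2^20                      # estimated size in MiB

# Pointers alone for a matrix at the theoretical element limit, in GiB.
.Machine$integer.max * 8 / 2^30
```

Under these assumptions a matrix with a few million rows stays far below 4 GB, while a matrix at the theoretical limit would need roughly 16 GiB for the pointers alone.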

So what typically happens is that the machine runs out of physical memory, falls back on virtual memory, and starts swapping. Heavy swapping typically kills all hope of ever finishing in this century - especially on Windows.

But if you are using a package (igraph?) and asking it to produce the matrix, it probably does a lot of internal work and creates lots of auxiliary objects. So even if you are nowhere near the memory limit for the single result matrix, the algorithm used to produce it can run out of memory. It can also be non-linear (quadratic or worse) in time, which would again kill all hope of ever finishing in this century...

A good way to investigate is to time it on a small graph (e.g. using system.time), and then again after doubling the graph size a couple of times. Then you can see whether the time grows linearly or quadratically, and you can estimate how long your big graph will take. If the prediction says a week, well, then you know ;-)
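The doubling experiment could look something like the sketch below. The function `build_edge_matrix` is a hypothetical stand-in for whatever step is slow in your own code (it just builds a random 2-column character matrix using base R), and the starting size of 25000 links is arbitrary.

```r
# Hypothetical stand-in for the real workload: build a 2-column character
# matrix of edge endpoints for a graph with n_links links.
build_edge_matrix <- function(n_links) {
  from <- as.character(sample.int(n_links, n_links, replace = TRUE))
  to   <- as.character(sample.int(n_links, n_links, replace = TRUE))
  cbind(from, to)
}

sizes <- 25000 * 2^(0:3)   # 25k, 50k, 100k, 200k links

# Elapsed wall-clock seconds for each size.
elapsed <- sapply(sizes, function(n) {
  system.time(build_edge_matrix(n))[["elapsed"]]
})

timings <- data.frame(n_links = sizes, elapsed = elapsed)
# Ratio of each time to the previous one: about 2 suggests linear growth,
# about 4 suggests quadratic growth.
timings$ratio <- c(NA, elapsed[-1] / elapsed[-length(elapsed)])
timings
```

If the measured times are close to zero, the ratios are not meaningful; increase the starting size until each run takes at least a noticeable fraction of a second.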
