Is there a package like bigmemory in R that can deal with large list objects?


Question


I know that the R package bigmemory works great for large matrices and data frames. However, I was wondering if there is any package, or any way, to work efficiently with large lists.


Specifically, I created a list whose elements are vectors. I have a for loop, and during each iteration multiple values are appended to a selected element (a vector) of that list. At first it runs fast, but after roughly 10,000 iterations it gradually slows down (one iteration takes about a second). I'm going to run about 70,000 to 80,000 iterations, and the list will be very large by then.
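The pattern described above can be sketched in base R as follows (the names and sizes are hypothetical, not from the question). Growing a vector with `c()` copies the whole vector on every append, so the total work grows quadratically with length, which matches the slowdown described; preallocating each vector and filling by index avoids the repeated copying:

```r
n_groups <- 5      # hypothetical: number of vectors in the list
n_iter   <- 1000   # hypothetical: iterations of the outer loop

# Slow pattern: append to a selected vector on every iteration.
# Each c() call copies the whole vector, so work grows quadratically.
slow <- vector("list", n_groups)
for (i in seq_len(n_iter)) {
  k <- (i %% n_groups) + 1          # pick an element of the list
  slow[[k]] <- c(slow[[k]], i)      # copy-on-append
}

# Faster pattern: preallocate each vector, fill by index,
# and track how many slots of each are actually used.
fast <- lapply(seq_len(n_groups), function(k) integer(n_iter))
used <- integer(n_groups)
for (i in seq_len(n_iter)) {
  k <- (i %% n_groups) + 1
  used[k] <- used[k] + 1
  fast[[k]][used[k]] <- i
}
# trim each vector down to the slots that were filled
fast <- lapply(seq_len(n_groups), function(k) fast[[k]][seq_len(used[k])])
```

Both loops produce the same list of vectors; only the preallocated version keeps a constant per-iteration cost.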


So I was just wondering whether there is something like a big.list, analogous to big.matrix in the bigmemory package, that could speed up this whole process.

Thanks!

Answer


I'm not really sure if this is a helpful answer, but you can work interactively with lists on disk using the filehash package.


For example, here's some code that creates a disk database, assigns a preallocated empty list to it, and then runs a function (getting the current time) that fills the list in the database.

library(filehash)  # provides dbCreate() and dbInit()

# how many items in the list?
n <- 100000
# set up the database on disk
dbCreate("testDB")
db <- dbInit("testDB")
# preallocate an empty list in the database
db$time <- vector("list", length = n)
# fill the list, working through the disk object
for (i in 1:n) db$time[[i]] <- Sys.time()


There is hardly any use of RAM during this process; however, it is VERY slow (two orders of magnitude slower than doing it in RAM in some of my tests) due to the constant disk I/O. So I'm not sure this method is a good answer to the question of how to speed up working on big objects.
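For reference, the in-RAM baseline that the timing comparison refers to is just the same loop run against an ordinary preallocated list (a minimal sketch; `n` matches the filehash snippet above):

```r
# In-RAM equivalent of the filehash loop: preallocate an ordinary
# list and fill it by index. No disk I/O happens here, which is why
# it runs orders of magnitude faster than the on-disk version.
n <- 100000
times <- vector("list", length = n)
for (i in seq_len(n)) times[[i]] <- Sys.time()
```

Wrapping each loop in `system.time()` is an easy way to reproduce the comparison on your own machine.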

