How can I prevent rbind() from getting really slow as a data frame grows larger?


Question

I have a data frame with only one row. I then add rows to it one at a time using rbind:

# df is my data frame with only one row; newrow is the row to append
for (i in 1:20000) {
    df <- rbind(df, newrow)
}

This gets very slow as i grows. Why is that, and how can I make this type of code faster?

Recommended answer

You are in the 2nd circle of hell, namely failing to pre-allocate data structures.

Growing objects in this fashion is a Very Very Bad Thing in R. Either pre-allocate and insert:

df <- data.frame(x = rep(NA,20000),y = rep(NA,20000))

or restructure your code to avoid this sort of incremental addition of rows. As discussed at the link I cite, the reason for the slowness is that each time you add a row, R needs to find a new contiguous block of memory to fit the data frame in, and the whole data frame is copied into it. The total work therefore grows much faster than the number of rows. Lots o' copying.
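
As a rough sketch of both options, the code below fills a pre-allocated data frame by index, and separately collects the rows in a list and builds the data frame once at the end. The make_row() helper and the df2 and rows names are hypothetical, standing in for whatever produces newrow in the question; the columns x and y follow the pre-allocation line in the answer.

```r
n <- 20000

# Hypothetical stand-in for whatever produces `newrow` in the question.
make_row <- function(i) list(x = as.numeric(i), y = as.numeric(i)^2)

# Approach 1: pre-allocate the full data frame, then fill it by index.
# NA_real_ fixes both columns as numeric up front.
df <- data.frame(x = rep(NA_real_, n), y = rep(NA_real_, n))
for (i in seq_len(n)) {
  newrow <- make_row(i)
  df$x[i] <- newrow$x
  df$y[i] <- newrow$y
}

# Approach 2: restructure so the data frame is built only once.
# Rows are collected in a pre-sized list, then turned into columns.
rows <- vector("list", n)
for (i in seq_len(n)) {
  rows[[i]] <- make_row(i)
}
df2 <- data.frame(
  x = vapply(rows, function(r) r$x, numeric(1)),
  y = vapply(rows, function(r) r$y, numeric(1))
)

# Both approaches should produce the same data frame.
stopifnot(isTRUE(all.equal(df, df2)))
```

Neither version calls rbind() inside the loop, so the data frame is never rebuilt row by row. If each row really is a function of i alone, the second form can usually be reduced further to plain vectorized column expressions.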
