Mass rbind.fill for many data frames


Problem description

I am attempting to row bind many data frames together into a single massive data frame. The data frames are named sequentially with the first named df1, the second named df2, the third named df3, etc. Currently, I have bound these data frames together by explicitly typing the names of the data frames; however, for a very large number of data frames (roughly 10,000 total data frames are expected) this is suboptimal.

Here is a working example:

# Load required packages
library(plyr)

# Generate 100 example data frames
for (i in 1:100) {
  assign(paste0('df', i), data.frame(x = 1:100,
                                     y = seq(from = 1,
                                             to = 1000,
                                             length.out = 100)))
}

# Create a master merged data frame
 df <- rbind.fill(df1, df2, df3, df4, df5, df6, df7, df8, df9, df10,
             df11, df12, df13, df14, df15, df16, df17, df18, df19, df20,
             df21, df22, df23, df24, df25, df26, df27, df28, df29, df30,
             df31, df32, df33, df34, df35, df36, df37, df38, df39, df40,
             df41, df42, df43, df44, df45, df46, df47, df48, df49, df50,
             df51, df52, df53, df54, df55, df56, df57, df58, df59, df60,
             df61, df62, df63, df64, df65, df66, df67, df68, df69, df70,
             df71, df72, df73, df74, df75, df76, df77, df78, df79, df80,
             df81, df82, df83, df84, df85, df86, df87, df88, df89, df90,
             df91, df92, df93, df94, df95, df96, df97, df98, df99, df100)

Any thoughts on how to optimize this would be greatly appreciated.

Solution

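One option is data.table::rbindlist. Set fill = TRUE so that any columns missing from some of the data frames are filled with NA.

library(data.table)

# Anchor the pattern so only df1, df2, ... match; a bare "df" would also
# pick up the merged `df` object if it already exists in the workspace.
rbindlist(mget(ls(pattern = "^df[0-9]+$")), fill = TRUE)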

         x          y
    1:   1    1.00000
    2:   2   11.09091
    3:   3   21.18182
    4:   4   31.27273
    5:   5   41.36364
   ---               
 9996:  96  959.63636
 9997:  97  969.72727
 9998:  98  979.81818
 9999:  99  989.90909
10000: 100 1000.00000
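
Two details of the call above may be worth seeing on a small scale: what fill = TRUE does when columns differ, and the fact that ls() returns names in alphabetical order (so "df10" comes before "df2"). The following is only a sketch under assumptions: the toy data frames, the `frames`, `merged`, and `ordered` names, and the `source` id column are made up for illustration and are not part of the original question.

```r
library(data.table)

# Hypothetical toy data: the second frame lacks y, the third adds z.
frames <- list(
  first  = data.frame(x = 1:2, y = c(1.5, 2.5)),
  second = data.frame(x = 3L),
  third  = data.frame(x = 4L, y = 4.5, z = "a")
)

# fill = TRUE pads missing columns with NA; idcol records which list
# element each row came from (here the list names).
merged <- rbindlist(frames, fill = TRUE, idcol = "source")
print(merged)

# To keep numeric order, build the object names explicitly instead of
# relying on ls(), which sorts them alphabetically.
n <- 12
for (i in seq_len(n)) assign(paste0("df", i), data.frame(id = i))
ordered <- rbindlist(mget(paste0("df", seq_len(n))), fill = TRUE)
print(ordered$id)
```

If the data frames can be created inside lapply() in the first place, the resulting list can be passed to rbindlist() directly and the assign()/mget() round trip is not needed. plyr::rbind.fill is also documented to accept a list of data frames as its first argument, so the same list should work there if you prefer to stay with plyr.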
