Merging lists of data frames in R efficiently


Question

I have 12 sets of 4 lists, each containing between 2 and 13 data frames. I want to merge them into 1 set of 4 lists of data frames.

Although I call them sets, they are simply stored in the global environment as:
list_a_1,
list_b_1,
list_c_1,
list_d_1,
list_a_2,
list_b_2,
list_c_2,
list_d_2,
...

list_a_1 through list_a_12 contain data frames with exactly the same names and the same columns.

My desired outcome is 4 lists, each containing the data frames from all 12 sets merged.


df1 = data.frame(A = 1:5, B = 100:104)
df2 = data.frame(C = 6:10, D = 100:104)

list_a_1 = list(df1, df2)
list_a_2 = list(df1, df2)



desired_outcome
df1
   A   B
1  1 100
2  2 101
3  3 102
4  4 103
5  5 104
6  1 100
7  2 101
8  3 102
9  4 103
10 5 104

df2
    C   D
1   6 100
2   7 101
3   8 102
4   9 103
5  10 104
6   6 100
7   7 101
8   8 102
9   9 103
10 10 104

I tried writing a function with rbind, append, merge, etc., intending to use it with lapply, but I cannot seem to get it right. Since each list is quite large, efficiency is also an important factor.

Recommended answer

As the corresponding elements of the lists are to be combined with rbind, use Map in base R

Map(rbind, list_a_1, list_a_2)
#[[1]]
#   A   B
#1  1 100
#2  2 101
#3  3 102
#4  4 103
#5  5 104
#6  1 100
#7  2 101
#8  3 102
#9  4 103
#10 5 104

#[[2]]
#    C   D
#1   6 100
#2   7 101
#3   8 102
#4   9 103
#5  10 104
#6   6 100
#7   7 101
#8   8 102
#9   9 103
#10 10 104

Or loop over the sequence of one list, extract the elements from each list by index, and rbind them

lapply(seq_along(list_a_1), function(i) rbind(list_a_1[[i]], list_a_2[[i]]))
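
The same idea extends to more than two lists, because Map accepts any number of list arguments and hands the i-th element of each one to rbind. A minimal sketch, assuming a third list list_a_3 (hypothetical, created here only for illustration) with the same layout as the other two:

```r
df1 <- data.frame(A = 1:5, B = 100:104)
df2 <- data.frame(C = 6:10, D = 100:104)

list_a_1 <- list(df1, df2)
list_a_2 <- list(df1, df2)
list_a_3 <- list(df1, df2)  # hypothetical third set, not in the question's example

# Map passes the i-th element of every list to rbind
res <- Map(rbind, list_a_1, list_a_2, list_a_3)

# The same call built from a list of lists, useful when the
# number of sets is not known in advance
all_sets <- list(list_a_1, list_a_2, list_a_3)
res2 <- do.call(Map, c(f = rbind, all_sets))

sapply(res, nrow)  # number of rows in each merged data frame
```

The do.call form is the one that scales to 12 sets, since the lists can be collected programmatically rather than typed out one by one.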




For multiple lists, we can build the object names with paste0

v1 <- paste0('list_', letters[1:4], "_", rep(1:2, each = 4))

and then fetch the objects with mget

lst1 <- mget(v1)

Or this can be done automatically with a regex pattern

list_b_1 <- list_a_1
list_b_2 <- list_a_2
list_c_1 <- list_a_1
list_c_2 <- list_a_2
list_d_1 <- list_a_1
list_d_2 <- list_a_2
nms <- ls(pattern = '^list_[a-d]_\\d+$')
lst1 <- mget(nms)
grps <- sub("list_([a-d])_\\d+", "\\1", nms)
lst2 <- split(lst1, grps)
out <- lapply(lst2, function(lstnew) do.call(Map, c(f = rbind, unname(lstnew))))

Checking the output

out$a
[[1]]
   A   B
1  1 100
2  2 101
3  3 102
4  4 103
5  5 104
6  1 100
7  2 101
8  3 102
9  4 103
10 5 104

[[2]]
    C   D
1   6 100
2   7 101
3   8 102
4   9 103
5  10 104
6   6 100
7   7 101
8   8 102
9   9 103
10 10 104

and for the 'd' object

out$d
[[1]]
   A   B
1  1 100
2  2 101
3  3 102
4  4 103
5  5 104
6  1 100
7  2 101
8  3 102
9  4 103
10 5 104

[[2]]
    C   D
1   6 100
2   7 101
3   8 102
4   9 103
5  10 104
6   6 100
7   7 101
8   8 102
9   9 103
10 10 104
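
To make the intermediate steps of the regex approach easier to follow, here is a sketch that runs the same pipeline inside a separate environment instead of the global one. The environment env and the loop that fills it are assumptions made for illustration; in the question the lists already exist in the global environment.

```r
df1 <- data.frame(A = 1:5, B = 100:104)
df2 <- data.frame(C = 6:10, D = 100:104)

# Hypothetical container standing in for the global environment
env <- new.env()
for (letter in letters[1:4]) {
  for (i in 1:2) {
    assign(paste0("list_", letter, "_", i), list(df1, df2), envir = env)
  }
}

# 1. Find the object names that match the naming scheme
nms <- ls(envir = env, pattern = "^list_[a-d]_\\d+$")

# 2. Fetch the objects into one named list of lists
lst1 <- mget(nms, envir = env)

# 3. Extract the letter from each name to use as the grouping key
grps <- sub("^list_([a-d])_\\d+$", "\\1", nms)

# 4. Split into one group per letter, then rbind position by position
lst2 <- split(lst1, grps)
out <- lapply(lst2, function(lstnew) do.call(Map, c(f = rbind, unname(lstnew))))

names(out)           # one entry per letter
sapply(out$a, nrow)  # rows in each merged data frame of group 'a'
```

Note that ls() returns names sorted as strings, so with 12 sets list_a_10 would come before list_a_2. If the row order of the merged data frames matters, reorder nms by the numeric suffix before calling mget.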




Or use map2 from purrr

library(dplyr)
library(purrr)
map2(list_a_1, list_a_2, bind_rows)
