Simultaneously merge multiple data.frames in a list


Question

I have a list of many data.frames that I want to merge. The issue here is that each data.frame differs in terms of the number of rows and columns, but they all share the key variables (which I've called "var1" and "var2" in the code below). If the data.frames were identical in terms of columns, I could merely rbind, for which plyr's rbind.fill would do the job, but that's not the case with these data.

Because the merge command only works on two data.frames at a time, I turned to the Internet for ideas. I found the function below, which worked perfectly in R 2.7.2, the version I had at the time:

merge.rec <- function(.list, ...){
    if(length(.list)==1) return(.list[[1]])
    Recall(c(list(merge(.list[[1]], .list[[2]], ...)), .list[-(1:2)]), ...)
}

I would call the function like this:

df <- merge.rec(my.list, by.x = c("var1", "var2"), 
                by.y = c("var1", "var2"), all = T, suffixes=c("", ""))

But in any R version after 2.7.2, including 2.11 and 2.12, this code fails with the following error:

Error in match.names(clabs, names(xi)) : 
  names do not match previous names

(Incidentally, I have seen other references to this error elsewhere, with no resolution.)

Is there any way to solve this?

Recommended answer

Another question asked specifically how to perform multiple left joins using dplyr in R. That question was marked as a duplicate of this one, so I answer here, using the three sample data frames below:

x <- data.frame(i = c("a","b","c"), j = 1:3, stringsAsFactors=FALSE)
y <- data.frame(i = c("b","c","d"), k = 4:6, stringsAsFactors=FALSE)
z <- data.frame(i = c("c","d","a"), l = 7:9, stringsAsFactors=FALSE)

Update June 2018: I divided the answer into three sections representing three different ways to perform the merge. You probably want to use the purrr way if you are already using the tidyverse packages. For comparison, a base R version using the same sample dataset appears further down.

1) Join them with reduce from the purrr package:

The purrr package provides a reduce function which has a concise syntax:

library(tidyverse)
list(x, y, z) %>% reduce(left_join, by = "i")
#  A tibble: 3 x 4
#  i       j     k     l
#  <chr> <int> <int> <int>
# 1 a      1    NA     9
# 2 b      2     4    NA
# 3 c      3     5     7

You can also perform other joins, such as a full_join or inner_join:

list(x, y, z) %>% reduce(full_join, by = "i")
# A tibble: 4 x 4
# i       j     k     l
# <chr> <int> <int> <int>
# 1 a     1     NA     9
# 2 b     2     4      NA
# 3 c     3     5      7
# 4 d     NA    6      8

list(x, y, z) %>% reduce(inner_join, by = "i")
# A tibble: 1 x 4
# i       j     k     l
# <chr> <int> <int> <int>
# 1 c     3     5     7
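
The same pattern should carry over to the situation in the question, where the data frames are already stored in a list and share two key columns. The sketch below is only an illustration: df1, df2, df3 and my_list are made-up names and data, and only the keys var1 and var2 come from the question. It assumes dplyr and purrr are installed.

```r
library(dplyr)
library(purrr)

# Hypothetical data frames that share the keys var1 and var2
# but differ in their remaining columns and in their rows.
df1 <- data.frame(var1 = c("a", "b"), var2 = c(1, 2), p = c(10, 20),
                  stringsAsFactors = FALSE)
df2 <- data.frame(var1 = c("b", "c"), var2 = c(2, 3), q = c(30, 40),
                  stringsAsFactors = FALSE)
df3 <- data.frame(var1 = c("a", "c"), var2 = c(1, 3), r = c(50, 60),
                  stringsAsFactors = FALSE)
my_list <- list(df1, df2, df3)

# full_join keeps every key combination found in any data frame;
# the by argument is forwarded to full_join at each reduction step.
merged <- reduce(my_list, full_join, by = c("var1", "var2"))
merged
```

A full join is used here because the question asks for all = TRUE, which keeps unmatched rows from both sides.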


2) dplyr::left_join() with base R Reduce():

list(x,y,z) %>%
    Reduce(function(dtf1,dtf2) left_join(dtf1,dtf2,by="i"), .)

#   i j  k  l
# 1 a 1 NA  9
# 2 b 2  4 NA
# 3 c 3  5  7


3) Base R merge() with base R Reduce():

And for comparison purposes, here is a base R version of the left join based on Charles's answer.

 Reduce(function(dtf1, dtf2) merge(dtf1, dtf2, by = "i", all.x = TRUE),
        list(x,y,z))
#   i j  k  l
# 1 a 1 NA  9
# 2 b 2  4 NA
# 3 c 3  5  7
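
For the original question (a list of data frames, two key columns, and a full outer join), the base R approach can be wrapped in a small helper. This is a minimal sketch rather than a definitive replacement for merge.rec: merge_all is a hypothetical name, not a base R function, and the sample data are invented for illustration.

```r
# merge_all is a hypothetical helper: it folds merge() over a list of
# data frames and forwards any extra arguments (by, all, ...) to merge().
merge_all <- function(dfs, ...) {
  Reduce(function(left, right) merge(left, right, ...), dfs)
}

# Made-up data frames sharing the keys var1 and var2
d1 <- data.frame(var1 = c("a", "b"), var2 = c(1, 2), p = c(10, 20),
                 stringsAsFactors = FALSE)
d2 <- data.frame(var1 = c("b", "c"), var2 = c(2, 3), q = c(30, 40),
                 stringsAsFactors = FALSE)
d3 <- data.frame(var1 = c("a", "c"), var2 = c(1, 3), r = c(50, 60),
                 stringsAsFactors = FALSE)

# all = TRUE requests a full outer join, as in the question
result <- merge_all(list(d1, d2, d3), by = c("var1", "var2"), all = TRUE)
result
```

Because the non-key column names are distinct in this example, no suffixes are needed. If the data frames share non-key column names, merge() will add suffixes to tell them apart.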
