R: multiple joins from a list of data frames of differing lengths and differing keys
Question
Let's say I've got this list of data frames:
library(tidyverse)
df_list <- list(data.frame(cheese = c("ex", "ok", "bd"),
                           cheese_val = c(3:1),
                           stringsAsFactors = FALSE),
                data.frame(egg = c("great", "good", "bad", "eww"),
                           egg_val = c(4:1),
                           stringsAsFactors = FALSE),
                data.frame(milk = c("good", "bad"),
                           milk_val = c(2:1),
                           stringsAsFactors = FALSE))
And I have this core data set:
core_dat <- data.frame(cheese = c("ex", "ok", "ok", "bd", "ok"),
                       egg = c("great", "bad", "bad", "eww", "great"),
                       milk = c("good", "good", "good", "bad", "good"),
                       stringsAsFactors = FALSE)
I'd like to get core_dat joined individually with each element of df_list.
I tried this:
for(i in 1:length(df_list)) {
gg<-core_dat %>%
left_join(df_list[[i]], by = names(df_list[[i]][1]), copy = T)
}
This ran, but it only applied the join to the milk column, so the only additional column in gg was milk_val. I expected to see cheese_val and egg_val too.
I suspect there are more appropriate options than a for loop here, and I am looking for suggestions. Note that my actual data set has many more data frames than this small example.
I should note that I expect the resulting data frame, in this case gg, to contain 6 columns in total (3 with the standard names + 3 with the "val" suffix), so that it looks like the printed version of this:
data.frame(cheese = c("ex", "ok", "ok", "bd", "ok"),
           egg = c("great", "bad", "bad", "eww", "great"),
           milk = c("good", "good", "good", "bad", "good"),
           cheese_val = c(3, 2, 2, 1, 2),
           egg_val = c(4, 2, 2, 1, 4),
           milk_val = c(2, 2, 2, 1, 2))
I've seen many "multiple joins" answers here, but none that quite line up with what I'm trying to accomplish (differing key columns, differing lengths of data).
Recommended answer
You can use map to get a list of joined data frames, then use reduce to join them all together.
map(df_list, right_join, rownames_to_column(core_dat)) %>%
reduce(full_join)
# Joining, by = "cheese"
# Joining, by = "egg"
# Joining, by = "milk"
# Joining, by = c("cheese", "rowname", "egg", "milk")
# Joining, by = c("cheese", "rowname", "egg", "milk")
# cheese cheese_val rowname egg milk egg_val milk_val
# 1 ex 3 1 great good 4 2
# 2 ok 2 2 bad good 2 2
# 3 ok 2 3 bad good 2 2
# 4 bd 1 4 eww bad 1 1
# 5 ok 2 5 great good 4 2
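As a possible alternative, here is a minimal sketch that folds each lookup table into core_dat directly. The loop in the question assigns gg from core_dat on every iteration, so each pass discards the previous join and only the last one (milk) survives. Carrying the accumulated result forward avoids that, either in a loop or with reduce and its .init argument. The sketch assumes the first column of each data frame in df_list is its join key, as in the question; the names acc, lookup and gg_loop are only illustrative.

```r
library(dplyr)
library(purrr)

df_list <- list(
  data.frame(cheese = c("ex", "ok", "bd"), cheese_val = 3:1,
             stringsAsFactors = FALSE),
  data.frame(egg = c("great", "good", "bad", "eww"), egg_val = 4:1,
             stringsAsFactors = FALSE),
  data.frame(milk = c("good", "bad"), milk_val = 2:1,
             stringsAsFactors = FALSE)
)

core_dat <- data.frame(
  cheese = c("ex", "ok", "ok", "bd", "ok"),
  egg    = c("great", "bad", "bad", "eww", "great"),
  milk   = c("good", "good", "good", "bad", "good"),
  stringsAsFactors = FALSE
)

# Loop version: start from core_dat and keep joining onto the running result,
# instead of restarting from core_dat on every iteration.
# Assumption: the first column of each lookup table is the join key.
gg_loop <- core_dat
for (lookup in df_list) {
  gg_loop <- left_join(gg_loop, lookup, by = names(lookup)[1])
}

# Same idea with purrr::reduce: .init seeds the accumulator with core_dat.
gg <- reduce(
  df_list,
  function(acc, lookup) left_join(acc, lookup, by = names(lookup)[1]),
  .init = core_dat
)

print(gg)
```

Because every step is a left join on a single key column, this keeps the rows of core_dat in their original order and needs no helper rowname column. In the map/reduce answer above, the rowname column appears to be what keeps the two identical core_dat rows (rows 2 and 3) from matching each other during the full joins.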