How to concatenate data.frames inside lists by using names?

Problem description

I have to import over 1,000 Excel files, and each file contains multiple sheets (some sheet names are shared across files and some are not).

Here is a small example:

games <- data.frame(index = c(1,2,3), player = c('John', 'Sam', 'Mary'))
weather <- data.frame(index = c(1,2,3), temperature = c('hot', 'cold', 'rainy'))
cars <- data.frame(index = c(1,2,3), car = c('honda', 'toyota','bmw'))
list1 <- list(games, weather, cars)
names(list1) <-  c('games', 'weather', 'cars')

games <- data.frame(index = c(1,2,3), player = c('AA', 'BB', 'CC'))
weather <- data.frame(index = c(1,2,3), temperature = c('cold', 'rainy', 'hot'))
sport <- data.frame(index = c(1,2,3), interest = c('swim', 'soccer', 'rugby'))
list2 <- list(games, weather, sport)
names(list2) <-  c('games', 'weather', 'sport')
list3 <- list(games, weather)
names(list3) <-  c('games', 'weather')

rm(games, sport, weather, cars)  # clean envir from unneeded stuff

I am looking for a way to combine the lists by using the names of their elements. I have tried merge() and mapply(), but they did not return what I wanted.

The result I want looks like this:

$games
# A tibble: 6 x 2
  index player
  <dbl> <chr> 
1     1 John  
2     2 Sam   
3     3 Mary  
4     1 AA    
5     2 BB    
6     3 CC    

$weather
# A tibble: 6 x 2
  index temperature
  <dbl> <chr>      
1     1 hot        
2     2 cold       
3     3 rainy      
4     1 cold       
5     2 rainy      
6     3 hot        

$cars
# A tibble: 3 x 2
  index car   
  <dbl> <chr> 
1     1 honda 
2     2 toyota
3     3 bmw   

$sport
  index interest
1     1     swim
2     2   soccer
3     3    rugby

I have also run into the case where a data.frame (sport) exists in list2 but not in list1.

Recommended answer

You can use purrr to help manipulate the lists. I set stringsAsFactors = FALSE only so that the character columns are not converted to factors and the data.frames can be bound cleanly. If you already use tibbles, you won't have this issue.

  • I create a list of the lists.
  • transpose() rearranges the list so that elements are regrouped by name. Basically, x[[1]][[2]] is equivalent to transpose(x)[[2]][[1]].
  • I use map() to iterate through the transposed list, and dplyr::bind_rows() to bind each group into the resulting data.frame.
options(stringsAsFactors = FALSE)
games <- data.frame(index = c(1,2,3), player = c('John', 'Sam', 'Mary'))
weather <- data.frame(index = c(1,2,3), temperature = c('hot', 'cold', 'rainy'))
cars <- data.frame(index = c(1,2,3), car = c('honda', 'toyota','bmw'))
list1 <- list(games, weather, cars)
names(list1) <-  c('games', 'weather', 'cars')

games <- data.frame(index = c(1,2,3), player = c('AA', 'BB', 'CC'))
weather <- data.frame(index = c(1,2,3), temperature = c('cold', 'rainy', 'hot'))
list2 <- list(games, weather)
names(list2) <-  c('games', 'weather')

library(purrr)
list(list1, list2) %>%
  # regroup named element together
  transpose() %>%
  # bind the df together
  map(dplyr::bind_rows)
#> $games
#>   index player
#> 1     1   John
#> 2     2    Sam
#> 3     3   Mary
#> 4     1     AA
#> 5     2     BB
#> 6     3     CC
#> 
#> $weather
#>   index temperature
#> 1     1         hot
#> 2     2        cold
#> 3     3       rainy
#> 4     1        cold
#> 5     2       rainy
#> 6     3         hot
#> 
#> $cars
#>   index    car
#> 1     1  honda
#> 2     2 toyota
#> 3     3    bmw

Created on 2018-11-04 by the reprex package (v0.2.1)
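
To make the transpose() step concrete, here is a minimal sketch that uses toy character values instead of data frames. The names x and tx are illustrative only and do not come from the answer.

```r
library(purrr)

# Two small named lists standing in for list1 and list2 (toy values)
x <- list(
  list(games = "g1", weather = "w1"),
  list(games = "g2", weather = "w2")
)

tx <- transpose(x)

# The outer and inner indices swap places: both sides are "w1"
identical(x[[1]][[2]], tx[[2]][[1]])

# After transposing, the outer level is keyed by element name
names(tx)
```

Each element of tx now holds everything that shared a name in the input lists, which is the shape map(dplyr::bind_rows) needs.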

If the first list does not contain all the elements you want, you need to provide the .names argument to transpose(). See help("transpose", package = "purrr"). I built an example for that.
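
Before the full example, a toy sketch (not from the original answer) of why .names matters. It assumes two small lists a and b holding character values; the names a, b and with_names are illustrative only.

```r
library(purrr)

# Toy stand-ins for the two lists; the first one lacks "cars"
a <- list(games = "g1", weather = "w1")
b <- list(games = "g2", weather = "w2", cars = "c2")

# By default the output structure is taken from the first element,
# so "cars" is dropped
names(transpose(list(a, b)))

# With .names set to the union of all names, "cars" is kept
all_names <- reduce(map(list(a, b), names), union)
with_names <- transpose(list(a, b), .names = all_names)
names(with_names)

# The slot for the list that has no "cars" element is NULL
with_names$cars
```

The NULL placeholder is harmless in the binding step, as the output of the full example with data frames shows.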

options(stringsAsFactors = FALSE)
games <- data.frame(index = c(1,2,3), player = c('John', 'Sam', 'Mary'))
weather <- data.frame(index = c(1,2,3), temperature = c('hot', 'cold', 'rainy'))
list1 <- list(games = games, weather = weather)

games <- data.frame(index = c(1,2,3), player = c('AA', 'BB', 'CC'))
weather <- data.frame(index = c(1,2,3), temperature = c('cold', 'rainy', 'hot'))
cars <- data.frame(index = c(1,2,3), car = c('honda', 'toyota','bmw'))
list2 <- list(games = games, weather = weather, cars = cars)

library(purrr)
all_list <- list(list1, list2)
all_names <- all_list %>% map(names) %>% reduce(union)
list(list1, list2) %>%
  # regroup named element together
  transpose(.names = all_names) %>%
  # bind the df together
  map(dplyr::bind_rows)
#> $games
#>   index player
#> 1     1   John
#> 2     2    Sam
#> 3     3   Mary
#> 4     1     AA
#> 5     2     BB
#> 6     3     CC
#> 
#> $weather
#>   index temperature
#> 1     1         hot
#> 2     2        cold
#> 3     3       rainy
#> 4     1        cold
#> 5     2       rainy
#> 6     3         hot
#> 
#> $cars
#>   index    car
#> 1     1  honda
#> 2     2 toyota
#> 3     3    bmw

Created on 2018-11-04 by the reprex package (v0.2.1)
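
Since the question involves many files, the same regrouping can also be sketched in base R for any number of lists. This is an alternative technique, not the answer's purrr approach, and bind_by_name is a hypothetical helper name. It assumes that data frames sharing a name also share the same columns, because rbind() does not fill missing columns with NA the way dplyr::bind_rows() does.

```r
# Hypothetical helper, not part of the original answer: row-bind data frames
# that share a name across any number of named lists (base R only).
bind_by_name <- function(...) {
  # Flatten one level; names repeat wherever several lists share a sheet name
  flat <- do.call(c, unname(list(...)))
  # Group by name, keeping names in order of first appearance
  groups <- split(flat, factor(names(flat), levels = unique(names(flat))))
  # unname() keeps rbind() from prefixing row names with the list name
  lapply(groups, function(dfs) do.call(rbind, unname(dfs)))
}

list1 <- list(
  games = data.frame(index = 1:3, player = c("John", "Sam", "Mary")),
  cars  = data.frame(index = 1:3, car = c("honda", "toyota", "bmw"))
)
list2 <- list(
  games = data.frame(index = 1:3, player = c("AA", "BB", "CC")),
  sport = data.frame(index = 1:3, interest = c("swim", "soccer", "rugby"))
)

combined <- bind_by_name(list1, list2)
names(combined)        # games, cars, sport
nrow(combined$games)   # 6
```

If the per-file lists are themselves collected in one list, say all_lists (an assumed name), then do.call(bind_by_name, all_lists) would apply the helper to all of them at once.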
