Unnest a list column directly into several columns
Can I unnest a list column directly into n columns?
The list can be assumed to be regular, with all elements being of equal length.
If I had a character vector instead of a list column, I could use tidyr::separate(). I can use tidyr::unnest(), but then I need another helper variable before I can use tidyr::spread(). Am I missing an obvious method?
Example data:
library(tibble)
# tibble() replaces the deprecated data_frame()
df1 <- tibble(
  gr = c('a', 'b', 'c'),
  values = list(1:2, 3:4, 5:6)
)
# A tibble: 3 x 2
  gr    values
  <chr> <list>
1 a     <int [2]>
2 b     <int [2]>
3 c     <int [2]>
Goal:
df2 <- tibble(
  gr = c('a', 'b', 'c'),
  V1 = c(1, 3, 5),
  V2 = c(2, 4, 6)
)
# A tibble: 3 x 3
  gr       V1    V2
  <chr> <dbl> <dbl>
1 a        1.    2.
2 b        3.    4.
3 c        5.    6.
Current method:
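library(dplyr)
library(tidyr)
# unnest to long format, number the rows within each group, then spread to wide
unnest(df1, values) %>%
  group_by(gr) %>%
  mutate(r = paste0('V', row_number())) %>%
  spread(r, values)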
With tidyr 1.0.0 you can use unnest_wider():
library(tidyr)
df1 <- tibble(
  gr = c('a', 'b', 'c'),
  values = list(1:2, 3:4, 5:6)
)
unnest_wider(df1, values)
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> # A tibble: 3 x 3
#>   gr     ...1  ...2
#>   <chr> <int> <int>
#> 1 a         1     2
#> 2 b         3     4
#> 3 c         5     6
Created on 2019-09-14 by the reprex package (v0.3.0)
The output is verbose here because the elements that were unnested horizontally (the vector elements) were not named, and unnest_wider() doesn't want to guess the names silently.
We can name them beforehand to avoid this:
df1 %>%
  dplyr::mutate(values = purrr::map(values, setNames, c("V1", "V2"))) %>%
  unnest_wider(values)
#> # A tibble: 3 x 3
#>   gr       V1    V2
#>   <chr> <int> <int>
#> 1 a         1     2
#> 2 b         3     4
#> 3 c         5     6
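The hard-coded c("V1", "V2") above only fits vectors of length two. As a rough sketch (not part of the original answer), the names could instead be derived from each element's length, so the same pipeline works for any regular list column. The object name wide and the "V" prefix are illustrative choices only.

```r
library(tibble)
library(dplyr)
library(purrr)
library(tidyr)

df1 <- tibble(
  gr = c("a", "b", "c"),
  values = list(1:2, 3:4, 5:6)
)

# Name each vector V1, V2, ... according to its own length,
# so unnest_wider() has explicit column names to use.
wide <- df1 %>%
  mutate(values = map(values, function(x) setNames(x, paste0("V", seq_along(x))))) %>%
  unnest_wider(values)

print(wide)
```

Because the names are generated per element, this assumes the list really is regular; elements of differing length would produce columns padded with NA.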
Alternatively, just wrap the call in suppressMessages() or purrr::quietly().
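Another option, not mentioned in the original answer, is the names_sep argument of unnest_wider(). The sketch below assumes a newer tidyr release (to my understanding 1.2.0 or later), where supplying names_sep makes tidyr generate integer suffixes for unnamed elements. On tidyr 1.0.0 the resulting column names may differ, so treat this as version-dependent.

```r
library(tibble)
library(tidyr)

df1 <- tibble(
  gr = c("a", "b", "c"),
  values = list(1:2, 3:4, 5:6)
)

# Assumption: tidyr >= 1.2.0. With names_sep supplied, unnamed elements
# get names built from the column name plus a running index.
wide_sep <- unnest_wider(df1, values, names_sep = "_")

print(names(wide_sep))
```

The columns are then prefixed with the original column name rather than "V", which can be handy when several list columns are widened in the same data frame.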