Tidy method to split multiple columns using tidyr::separate
Question
I have a data frame like so:
df <- structure(list(A = c("3 of 5", "1 of 2", "1 of 3", "1 of 3",
"3 of 4", "2 of 7"), B = c("2 of 2", "2 of 4", "0 of 1", "0 of 0",
"0 of 0", "0 of 0"), C = c("10 of 21", "3 of 14", "11 of 34",
"10 of 35", "16 of 53", "17 of 62"), D = c("0 of 0", "0 of 0",
"0 of 0", "0 of 0", "0 of 0", "0 of 0"), E = c("8 of 16", "3 of 15",
"10 of 32", "6 of 28", "13 of 49", "9 of 48")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -6L))
df
|A |B |C |D |E |
|:------|:------|:--------|:------|:--------|
|3 of 5 |2 of 2 |10 of 21 |0 of 0 |8 of 16 |
|1 of 2 |2 of 4 |3 of 14 |0 of 0 |3 of 15 |
|1 of 3 |0 of 1 |11 of 34 |0 of 0 |10 of 32 |
|1 of 3 |0 of 0 |10 of 35 |0 of 0 |6 of 28 |
|3 of 4 |0 of 0 |16 of 53 |0 of 0 |13 of 49 |
|2 of 7 |0 of 0 |17 of 62 |0 of 0 |9 of 48 |
I want to split each column into 2, leaving me with something like this:
|A_attempted |A_landed |B_attempted |B_landed |C_attempted |C_landed |D_attempted |D_landed |E_attempted |E_landed |
|:-----------|:--------|:-----------|:--------|:-----------|:--------|:-----------|:--------|:-----------|:--------|
|3 |5 |2 |2 |10 |21 |0 |0 |8 |16 |
|1 |2 |2 |4 |3 |14 |0 |0 |3 |15 |
|1 |3 |0 |1 |11 |34 |0 |0 |10 |32 |
|1 |3 |0 |0 |10 |35 |0 |0 |6 |28 |
|3 |4 |0 |0 |16 |53 |0 |0 |13 |49 |
|2 |7 |0 |0 |17 |62 |0 |0 |9 |48 |
The method I am using so far is this:
df %>%
  separate(A, sep = " of ", remove = TRUE, into = c("A_attempted", "A_landed")) %>%
  separate(B, sep = " of ", remove = TRUE, into = c("B_attempted", "B_landed")) %>%
  separate(C, sep = " of ", remove = TRUE, into = c("C_attempted", "C_landed")) %>%
  separate(D, sep = " of ", remove = TRUE, into = c("D_attempted", "D_landed")) %>%
  separate(E, sep = " of ", remove = TRUE, into = c("E_attempted", "E_landed"))
This does not scale well, since my real data has 15 variables. I would prefer a solution using map.
There is an existing answer, "Apply tidyr::separate over multiple columns", but it uses deprecated functions.
Recommended answer
You can try:
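library(tidyverse)

# Split each column on " of ", then bind the two-column results together
names(df) %>%
  map(
    function(x)
      df %>%
        select(all_of(x)) %>%  # all_of() states explicitly that x holds a column name
        separate(x,
                 into = paste0(x, c("_attempted", "_landed")),
                 sep = " of ")
  ) %>%
  bind_cols()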
Output:
# A tibble: 6 x 10
  A_attempted A_landed B_attempted B_landed C_attempted C_landed D_attempted D_landed E_attempted E_landed
  <chr>       <chr>    <chr>       <chr>    <chr>       <chr>    <chr>       <chr>    <chr>       <chr>
1 3           5        2           2        10          21       0           0        8           16
2 1           2        2           4        3           14       0           0        3           15
3 1           3        0           1        11          34       0           0        10          32
4 1           3        0           0        10          35       0           0        6           28
5 3           4        0           0        16          53       0           0        13          49
6 2           7        0           0        17          62       0           0        9           48
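Every column in this output has type <chr>. If numeric columns are wanted, one option is the convert argument of separate(), which runs type conversion on the newly created columns. A minimal sketch on a single column, where the data frame `scores` and the result name `converted` are made up for illustration:

```r
library(tidyr)

# Hypothetical one-column example in the same "x of y" format as the question
scores <- data.frame(A = c("3 of 5", "1 of 2"), stringsAsFactors = FALSE)

converted <- separate(
  scores,
  A,
  into = c("A_attempted", "A_landed"),
  sep = " of ",
  convert = TRUE  # convert the new character columns to a more specific type
)

str(converted)
```

The same argument can be passed to separate() in the map-based code when integer columns are preferred over character ones.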
As the OP suggests, we can indeed avoid the final bind_cols() step by using map_dfc:
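# map_dfc() column-binds the per-column results, so bind_cols() is not needed
names(df) %>%
  map_dfc(~ df %>%
            select(all_of(.x)) %>%  # all_of() states explicitly that .x holds a column name
            separate(.x,
                     into = paste0(.x, c("_attempted", "_landed")),
                     sep = " of ")
  )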
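As a possible alternative to mapping over column names, and assuming tidyr 1.3.0 or later is installed, separate_wider_delim() accepts several columns at once. The sketch below is not part of the original answer. It uses a shortened, hypothetical copy of the data (`small`) and stores the result in `wide`:

```r
library(tidyr)

# Hypothetical two-column subset of the question's data
small <- data.frame(
  A = c("3 of 5", "1 of 2"),
  B = c("2 of 2", "2 of 4"),
  stringsAsFactors = FALSE
)

wide <- separate_wider_delim(
  small,
  cols = everything(),               # split every column
  delim = " of ",
  names = c("attempted", "landed"),  # suffixes for the two new columns
  names_sep = "_"                    # output names: <column>_<suffix>
)

wide
```

The new columns are still character vectors here, so a conversion step would be needed afterwards if numbers are required.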