Split column of comma-separated numbers into multiple columns based on value
Question
I have a column f in my data frame that I would like to spread into multiple columns based on the values in that column. For example:
df <- structure(list(f = c(NA, "18,17,10", "12,8", "17,11,6", "18",
"12", "12", NA, "17,11", "12")), .Names = "f", row.names = c(NA,
10L), class = "data.frame")
df
# f
# 1 <NA>
# 2 18,17,10
# 3 12,8
# 4 17,11,6
# 5 18
# 6 12
# 7 12
# 8 <NA>
# 9 17,11
# 10 12
How would I split column f into multiple indicator columns, one per number, showing which numbers appear in each row? I'm interested in something like this:
   6 8 10 11 12 17 18
1  0 0  0  0  0  0  0
2  0 0  1  0  0  1  1
3  0 1  0  0  1  0  0
4  1 0  0  1  0  1  0
5  0 0  0  0  0  0  1
6  0 0  0  0  1  0  0
7  0 0  0  0  1  0  0
8  0 0  0  0  0  0  0
9  0 0  0  1  0  1  0
10 0 0  0  0  1  0  0
I'm thinking I could use unique on the f column to create the separate columns based on the different numbers, and then use grepl to determine whether each specific number is in column f, but I was wondering if there was a better way. Something similar to spread or separate in the tidyr package.
Recommended answer
A solution using tidyr::separate_rows is as follows:
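library(tidyverse)
df %>% mutate(ind = row_number()) %>%      # remember the original row number
  separate_rows(f, sep=",") %>%            # one row per comma-separated number
  mutate(f = ifelse(is.na(f),0, f)) %>%    # placeholder "0" so NA rows are kept
  count(ind, f) %>%                        # n = 1 for each (row, number) pair
  spread(f, n, fill = 0) %>%               # one column per number, 0 where absent
  select(-2) %>% as.data.frame()           # drop the placeholder "0" column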
#    ind 10 11 12 17 18 6 8
# 1    1  0  0  0  0  0 0 0
# 2    2  1  0  0  1  1 0 0
# 3    3  0  0  1  0  0 0 1
# 4    4  0  1  0  1  0 1 0
# 5    5  0  0  1  0  0 0 0
# 6    6  0  0  1  0  0 0 0
# 7    7  0  0  1  0  0 0 0
# 8    8  0  0  0  0  0 0 0
# 9    9  0  1  0  1  0 0 0
# 10  10  0  0  1  0  0 0 0
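Note that the columns above come out in string order (10, 11, 12, 17, 18, 6, 8) rather than the numeric order shown in the question. As a rough sketch that is not part of the original answer, the same indicator table can be built in base R with the columns sorted numerically, assuming every non-missing entry of f is a comma-separated list of integers. The names parts, vals and ind are illustrative only.

```r
# Example data from the question
df <- data.frame(
  f = c(NA, "18,17,10", "12,8", "17,11,6", "18", "12", "12", NA, "17,11", "12"),
  stringsAsFactors = FALSE
)

# Split each entry on commas; a missing entry stays a single NA
parts <- strsplit(df$f, ",")

# Distinct numbers, sorted numerically (sort() drops the NA by default)
vals <- sort(unique(as.integer(unlist(parts))))

# For each row, flag which of the distinct numbers it contains (1) or not (0)
ind <- t(vapply(
  parts,
  function(p) as.integer(vals %in% as.integer(p)),
  integer(length(vals))
))
dimnames(ind) <- list(as.character(seq_len(nrow(df))), as.character(vals))
ind
```

Comparing whole numbers with %in% also sidesteps a pitfall of the grepl idea from the question, where a pattern such as "1" would match inside "18" or "11".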