tidyr spread does not aggregate data
Problem description
I have the following data:
> data <- data.frame(unique=1:9, grouping=rep(c('a', 'b', 'c'), each=3), value=sample(1:30, 9))
> data
unique grouping value
1 1 a 15
2 2 a 21
3 3 a 26
4 4 b 8
5 5 b 6
6 6 b 4
7 7 c 17
8 8 c 1
9 9 c 3
I would like to create a table that looks like this, with one column per group and the values listed in their original order:
a b c
1 15 8 17
2 21 6 1
3 26 4 3
I am using tidyr::spread and not getting the result I want:
> data %>% spread(grouping, value)
unique a b c
1 1 15 NA NA
2 2 21 NA NA
3 3 26 NA NA
4 4 NA 8 NA
5 5 NA 6 NA
6 6 NA 4 NA
7 7 NA NA 17
8 8 NA NA 1
9 9 NA NA 3
or
> data %>% select(grouping, value) %>% spread(grouping, value)
Error: Duplicate identifiers for rows (1, 2, 3), (4, 5, 6), (7, 8, 9)
Is there a way to do this that also works when one group (c) has a different length than the others?
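Recommended answer
spread() treats every column other than the key and value columns as a row identifier. With unique kept, each row has its own identifier, so the values never line up; with it dropped, nothing distinguishes the rows within a group, which triggers the duplicate identifiers error. We need to create a sequence column within each group to avoid that error.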
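library(tidyr)
library(dplyr)

data %>%
  group_by(grouping) %>%
  mutate(id = row_number()) %>%  # position of each row within its group
  select(-unique) %>%            # drop the column that made every row distinct
  spread(grouping, value) %>%    # id now lines up the n-th value of each group
  select(-id)                    # the helper column is no longer needed
#      a     b     c
#  (int) (int) (int)
#1    15     8    17
#2    21     6     1
#3    26     4     3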
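The question also asks about groups of different lengths. The sketch below applies the same sequence-column idea to a made-up data frame, uneven, in which group c has only two rows; the data and the names uneven, wide and wide2 are assumptions for illustration and do not come from the original post. It adds an explicit ungroup() before reshaping, and a second variant uses pivot_wider(), which supersedes spread() in tidyr 1.0.0 and later.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: groups a and b have three rows, group c has only two
uneven <- data.frame(
  unique   = 1:8,
  grouping = c("a", "a", "a", "b", "b", "b", "c", "c"),
  value    = c(15, 21, 26, 8, 6, 4, 17, 1)
)

# Same approach as the answer: number the rows within each group, then spread
wide <- uneven %>%
  group_by(grouping) %>%
  mutate(id = row_number()) %>%  # 1, 2, 3 within a and b; 1, 2 within c
  ungroup() %>%
  select(-unique) %>%
  spread(grouping, value) %>%    # cells with no matching row are filled with NA
  select(-id)

# Equivalent reshape with pivot_wider() (requires tidyr >= 1.0.0)
wide2 <- uneven %>%
  group_by(grouping) %>%
  mutate(id = row_number()) %>%
  ungroup() %>%
  select(-unique) %>%
  pivot_wider(names_from = grouping, values_from = value) %>%
  select(-id)

print(wide)
print(wide2)
```

In both results the third row of column c is NA, because group c has no third value to place there.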