How do I use spread and group_by on a single row dataset
Problem description
I have a form data frame that has multiple entries for the same IDs and dates.
I need to collapse this dataset to a single row per ID and date, but I am having trouble using gather, spread and group_by.
# surveys dataset
user_id <- c(100, 100, 100, 200, 200, 200)
int_id <- c(1000, 1000, 1000, 2000, 2000, 2000)
fech <- c('01/01/2019', '01/01/2019','01/01/2019','02/01/2019','02/01/2019','02/01/2019')
order <- c(1,2,3,1,2,3)
questions <- c('question1','question2','question3','question1','question2','question3')
answers <- c('answ1','answ2','answ3','answ1','answ2','answ3')
survey.data <- data.frame(user_id, int_id, fech, order, questions,answers)
> survey.data
  user_id int_id       fech order questions answers
1     100   1000 01/01/2019     1 question1   answ1
2     100   1000 01/01/2019     2 question2   answ2
3     100   1000 01/01/2019     3 question3   answ3
4     200   2000 02/01/2019     1 question1   answ1
5     200   2000 02/01/2019     2 question2   answ2
6     200   2000 02/01/2019     3 question3   answ3
I use spread to turn the question rows into columns:
library(dplyr)  # provides %>%, group_by() and select()
library(tidyr)  # provides spread()
survey.data %>%
  spread(key = questions, value = answers) %>%
  group_by(user_id, int_id, fech) %>%
  select(-order)
and get the following:
# A tibble: 6 x 6
  user_id int_id fech       question1 question2 question3
*   <dbl>  <dbl> <fctr>     <fctr>    <fctr>    <fctr>
1     100   1000 01/01/2019 answ1     NA        NA
2     100   1000 01/01/2019 NA        answ2     NA
3     100   1000 01/01/2019 NA        NA        answ3
4     200   2000 02/01/2019 answ1     NA        NA
5     200   2000 02/01/2019 NA        answ2     NA
6     200   2000 02/01/2019 NA        NA        answ3
I tried to group the resulting dataset, but I always get 6 rows instead of 2.
I expect the following:
user_id int_id fech       question1 question2 question3
    100   1000 01/01/2019 answ1     answ2     answ3
    200   2000 02/01/2019 answ1     answ2     answ3
My question is very similar to this one, but I can't figure out how to apply it to my data.
Recommended answer
I found (I think) another possible solution:
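library(dplyr)     # provides %>% and select()
library(reshape2)  # provides dcast()
survey.data %>%
  select(-order) %>%
  # `...` keeps every remaining column as an identifier; value.var is named explicitly
  dcast(... ~ questions, value.var = "answers")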
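For comparison, the same idea can probably be expressed with spread itself. The sketch below is not from the original answer; it assumes the 6 rows in the question come from the order column still being present when spread runs, so every row keeps a distinct combination of identifier columns and nothing collapses. Dropping order first should leave only user_id, int_id and fech as identifiers. The name `wide` is just an illustrative variable.

```r
library(dplyr)
library(tidyr)

# Same survey data as in the question, rebuilt so the example is self-contained
survey.data <- data.frame(
  user_id   = c(100, 100, 100, 200, 200, 200),
  int_id    = c(1000, 1000, 1000, 2000, 2000, 2000),
  fech      = c("01/01/2019", "01/01/2019", "01/01/2019",
                "02/01/2019", "02/01/2019", "02/01/2019"),
  order     = c(1, 2, 3, 1, 2, 3),
  questions = c("question1", "question2", "question3",
                "question1", "question2", "question3"),
  answers   = c("answ1", "answ2", "answ3", "answ1", "answ2", "answ3")
)

# Assumption: `order` is what keeps the rows apart, so drop it BEFORE spreading.
# No group_by() is needed; spread() collapses rows that share the same
# values in all remaining (non-key, non-value) columns.
wide <- survey.data %>%
  select(-order) %>%
  spread(key = questions, value = answers)

print(wide)
```

In tidyr 1.0.0 and later, spread is superseded by pivot_wider, so `pivot_wider(names_from = questions, values_from = answers)` in place of the spread call should be the equivalent newer spelling.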