tidyr: Gathering two values per key
Question
I have a dataset with the mean and SD of each variable as columns, and I want to convert it into "long" format, like so:
library(tidyverse)
iris %>%
  group_by(Species) %>%
  summarize_all(list(mean = mean, sd = sd))
#> # A tibble: 3 x 9
#> Species Sepal.Length_me~ Sepal.Width_mean Petal.Length_me~
#> <fct> <dbl> <dbl> <dbl>
#> 1 setosa 5.01 3.43 1.46
#> 2 versic~ 5.94 2.77 4.26
#> 3 virgin~ 6.59 2.97 5.55
#> # ... with 5 more variables: Petal.Width_mean <dbl>,
#> # Sepal.Length_sd <dbl>, Sepal.Width_sd <dbl>, Petal.Length_sd <dbl>,
#> # Petal.Width_sd <dbl>
# Desired output:
#
# tribble(~Species, ~Variable, ~Mean, ~SD
# #-------------------------------
# ... )
I feel like tidyr::gather would be a good fit here, but I am not sure how the syntax works when there are two values per key. Or do I need to use two gathers and column-bind the results?
Recommended answer
To convert your post-summarise_all data (stored here as df), you can do the following:
df %>%
  gather(key, val, -Species) %>%
  separate(key, into = c("Variable", "metric"), sep = "_") %>%
  spread(metric, val)
## A tibble: 12 x 4
# Species Variable mean sd
# <fct> <chr> <dbl> <dbl>
# 1 setosa Petal.Length 1.46 0.174
# 2 setosa Petal.Width 0.246 0.105
# 3 setosa Sepal.Length 5.01 0.352
# 4 setosa Sepal.Width 3.43 0.379
# 5 versicolor Petal.Length 4.26 0.470
# 6 versicolor Petal.Width 1.33 0.198
# 7 versicolor Sepal.Length 5.94 0.516
# 8 versicolor Sepal.Width 2.77 0.314
# 9 virginica Petal.Length 5.55 0.552
#10 virginica Petal.Width 2.03 0.275
#11 virginica Sepal.Length 6.59 0.636
#12 virginica Sepal.Width 2.97 0.322
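In tidyr 1.0.0 and later, gather and spread are superseded by pivot_longer and pivot_wider. As a minimal sketch (assuming those newer versions are installed, and assuming df is the summarised table from the question), the same reshape can be written in one step with the special ".value" name, which tells pivot_longer that the part of each column name after the underscore ("mean" or "sd") should become its own output column:

```r
library(dplyr)
library(tidyr)

# Assumption: df is the summarised table produced in the question
df <- iris %>%
  group_by(Species) %>%
  summarise_all(list(mean = mean, sd = sd))

# Split names such as "Sepal.Length_mean" at the underscore:
# the first piece goes into the Variable column, while the second piece
# (".value") names the output columns, here mean and sd
long <- df %>%
  pivot_longer(
    -Species,
    names_to = c("Variable", ".value"),
    names_sep = "_"
  )

long
```

The rows here follow the original column order rather than the alphabetical order that spread produces, so the result may need an arrange(Species, Variable) to match the output above exactly.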
However, it is actually faster and shorter to reshape the data from wide to long right at the start, before summarising:
iris %>%
  gather(Variable, val, -Species) %>%
  group_by(Species, Variable) %>%
  summarise(Mean = mean(val), SD = sd(val))
## A tibble: 12 x 4
## Groups: Species [?]
# Species Variable Mean SD
# <fct> <chr> <dbl> <dbl>
# 1 setosa Petal.Length 1.46 0.174
# 2 setosa Petal.Width 0.246 0.105
# 3 setosa Sepal.Length 5.01 0.352
# 4 setosa Sepal.Width 3.43 0.379
# 5 versicolor Petal.Length 4.26 0.470
# 6 versicolor Petal.Width 1.33 0.198
# 7 versicolor Sepal.Length 5.94 0.516
# 8 versicolor Sepal.Width 2.77 0.314
# 9 virginica Petal.Length 5.55 0.552
#10 virginica Petal.Width 2.03 0.275
#11 virginica Sepal.Length 6.59 0.636
#12 virginica Sepal.Width 2.97 0.322
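For reference, here is a sketch of the same reshape-first approach using pivot_longer in place of gather. It assumes tidyr 1.0.0 or later and dplyr 1.0.0 or later; the .groups = "drop" argument is an addition not present in the original answer and simply returns an ungrouped tibble:

```r
library(dplyr)
library(tidyr)

res <- iris %>%
  # Stack the four measurement columns into Variable/val pairs
  pivot_longer(-Species, names_to = "Variable", values_to = "val") %>%
  group_by(Species, Variable) %>%
  # One row per Species/Variable pair, with both statistics side by side
  summarise(Mean = mean(val), SD = sd(val), .groups = "drop")

res
```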