tidyverse pivot_longer several sets of columns, but avoid intermediate pivot_wider steps
Question
I have the following data:
dat <- data.frame(id = c("A", "B", "C"),
                  Q1r1_pepsi = c(1, 0, 1),
                  Q1r1_cola = c(0, 0, 1),
                  Q1r2_pepsi = c(1, 1, 1),
                  Q1r2_cola = c(0, 1, 1),
                  stringsAsFactors = FALSE)
where Q1r1 and Q1r2 are rating questions in a survey and pepsi and cola are the brands being rated. So I have two ratings (r1 and r2) for two brands (pepsi, cola):
id   Q1r1_pepsi  Q1r1_cola  Q1r2_pepsi  Q1r2_cola
"A"  1           0          1           0
"B"  0           0          1           1
"C"  1           1          1           1
(Side question: how do I format a SO post so that it correctly contains the nicely formatted output that I would get when calling dat in the R Console?)
To analyze the data I need to reshape (pivot) the data so that rows indicate unique rating-brand pairs. Thus, the expected outcome would be:
id   brand    Q1r1  Q1r2
"A"  "pepsi"  1     1
"A"  "cola"   0     0
"B"  "pepsi"  0     1
"B"  "cola"   0     1
"C"  "pepsi"  1     1
"C"  "cola"   1     1
Currently, I always use a combination of pivot_longer and pivot_wider, but I was hoping to get this result directly from pivot_longer, without the intermediate step:
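library(tidyverse)
# Step 1: stack all Q1 columns, then split names such as "Q1r1_pepsi" into item and brand
dat_long <- dat %>%
  pivot_longer(cols = starts_with("Q1")) %>%
  separate(name, into = c("item", "brand"), remove = FALSE)
# Step 2: spread the items back out so each rating gets its own column
dat_wide <- dat_long %>%
  pivot_wider(id_cols = c(id, brand),
              names_from = item,
              values_from = value)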
With this example the intermediate step is still OK, but it gets tiresome in less clean cases. For example, suppose my columns weren't named in a nice structure like Q1r1_pepsi, Q1r1_cola, Q1r2_pepsi, Q1r2_cola, but were instead Q4, Q5, Q8r1, Q8r2, where Q4 maps to Q8r1 and Q5 maps to Q8r2.
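Recommended answer
You can do this in a single pivot_longer call. The special ".value" entry in names_to tells tidyr that the first piece of each column name (Q1r1, Q1r2) should become an output column, while the second piece (pepsi, cola) goes into the brand column: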
tidyr::pivot_longer(dat, cols = -id,
                    names_to = c('.value', 'brand'),
                    names_sep = "_")
# id brand Q1r1 Q1r2
# <chr> <chr> <dbl> <dbl>
#1 A pepsi 1 1
#2 A cola 0 0
#3 B pepsi 0 1
#4 B cola 0 1
#5 C pepsi 1 1
#6 C cola 1 1
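The question also mentions columns with no shared naming structure (Q4, Q5, Q8r1, Q8r2). In that case names_sep has nothing to split on. One option is to write the mapping out by hand as a pivot spec and pass it to tidyr::pivot_longer_spec(). The following is only a minimal sketch under assumptions that are not in the original post: dat2 is a made-up data frame, rating1 and rating2 are invented output names, and Q4/Q8r1 are taken to refer to pepsi and Q5/Q8r2 to cola.

```r
library(tidyr)
library(tibble)

# Hypothetical data: same values as dat, but with unstructured column names
dat2 <- data.frame(id = c("A", "B", "C"),
                   Q4 = c(1, 0, 1),
                   Q5 = c(0, 0, 1),
                   Q8r1 = c(1, 1, 1),
                   Q8r2 = c(0, 1, 1),
                   stringsAsFactors = FALSE)

# Hand-written spec: .name is the input column, .value the output column,
# and brand is the key that ties Q4 to Q8r1 and Q5 to Q8r2 (assumed mapping)
spec <- tribble(
  ~.name, ~.value,   ~brand,
  "Q4",   "rating1", "pepsi",
  "Q5",   "rating1", "cola",
  "Q8r1", "rating2", "pepsi",
  "Q8r2", "rating2", "cola"
)

# Reshape in one step, driven entirely by the spec
dat2_long <- pivot_longer_spec(dat2, spec)
dat2_long
```

Because the mapping lives in an ordinary data frame, it could also be read from a codebook file rather than typed in, which may help when a survey has many such column pairs.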