`group_by` and keep grouping levels as nested data frame's name
Problem description
I am running several steps of a data analysis with the following code. I want to keep the levels of my grouping factor as the names of the nested data frames and use those names to identify each step along the way, instead of the default enumeration [[1]], [[2]], [[3]], etc. I don't understand the error I get. How can I fix my code?
library(dplyr)
library(tidyr)   # provides nest()
library(purrr)
library(emmeans)
data("warpbreaks")
wb_emm <- warpbreaks %>%
  group_by(tension) %>%
  setNames(unique(.x$tension)) %>%   # this is the step that fails
  nest() %>%
  mutate(models = map(data, ~glm(breaks ~ wool, data = .x))) %>%
  mutate(jt = map(models, ~emmeans::joint_tests(.x, data = .x$data))) %>%
  mutate(means = map(models, ~emmeans::emmeans(.x, "wool", data = .x$data))) %>%
  mutate(p_cont = map(means, ~emmeans::contrast(.x, "pairwise", infer = c(T, T))))
Error in unique(.x$tension) : object '.x' not found
I originally tried `group_by(tension) %>% setNames(unique(tension))` and got `Error in unique(tension) : object 'tension' not found`. I also tried `split(.$tension)`, but it conflicts with `nest()`.
Yet the levels of `tension` are clearly defined:
unique(warpbreaks$tension)
[1] L M H
Levels: L M H
The code runs well without the `setNames(unique(.x$tension)) %>%` step:
wb_emm$p_cont
[[1]]
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
A - B 16.3 6.87 Inf 2.87 29.8 2.378 0.0174
Confidence level used: 0.95
[[2]]
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
A - B -4.78 4.27 Inf -13.1 3.59 -1.119 0.2630
Confidence level used: 0.95
[[3]]
contrast estimate SE df asymp.LCL asymp.UCL z.ratio p.value
A - B 5.78 3.79 Inf -1.66 13.2 1.523 0.1277
Confidence level used: 0.95
Thanks.
Update: following the second solution provided by Ronak Shah below, I tried it on `diamonds`, but the names were unchanged. The code runs with either `ungroup() %>%` or `ungroup %>%`.
diamonds %>%
  group_by(cut) %>%
  nest() %>%
  ungroup %>%
  mutate(models = map(data, ~glm(price ~ x + y + z + clarity + color, data = .x)),
         jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
         means = map(models, ~emmeans::emmeans(.x, "color", data = .x$data)),
         p_cont = map(means, ~emmeans::contrast(.x, "pairwise", infer = c(T, T))),
         across(models:p_cont, stats::setNames, .$cut)) -> diamond_result
> diamond_result$jt
[[1]]
model term df1 df2 F.ratio p.value
x 1 Inf 611.626 <.0001
y 1 Inf 2.914 0.0878
z 1 Inf 100.457 <.0001
clarity 7 Inf 800.852 <.0001
color 6 Inf 256.796 <.0001
Recommended answer
You need to add `setNames` in the `map` step:
library(tidyverse)
warpbreaks %>%
  group_by(tension) %>%
  nest() %>%
  ungroup %>%
  mutate(models = map(data, ~glm(breaks ~ wool, data = .x)),
         jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
         means = map(models, ~emmeans::emmeans(.x, "wool", data = .x$data)),
         p_cont = setNames(map(means,
                               ~emmeans::contrast(.x, "pairwise", infer = c(T, T))),
                           .$tension))
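As a minimal sketch of why the placement matters (the names `toy`, `renamed`, `nested` and `named_models` are illustrative only, and the `emmeans` steps are left out to keep it short): `setNames()` applied to a data frame renames its columns, and `.x` is only defined inside a purrr formula, so a pipeline-level `setNames(unique(.x$tension))` cannot label the list elements. Naming the list itself, after it has been built, does.

```r
library(dplyr)
library(tidyr)
library(purrr)

# setNames() on a data frame renames its columns, so placing it between
# group_by() and nest() could not have labelled the later list elements.
toy <- data.frame(a = 1:2, b = 3:4)
renamed <- setNames(toy, c("x", "y"))
names(renamed)

# `.x` exists only inside a purrr formula such as ~ glm(..., data = .x);
# in a magrittr pipeline the piped-in object is referred to as `.`.
nested <- warpbreaks %>%
  group_by(tension) %>%
  nest() %>%
  ungroup() %>%
  mutate(models = map(data, ~ glm(breaks ~ wool, data = .x)))

# Name the list of models after the grouping levels, one name per row.
named_models <- setNames(nested$models, as.character(nested$tension))
names(named_models)
```

The explicit `as.character()` is a defensive choice here, so the names do not depend on how a factor is coerced.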
If you want to name all of the list outputs, use `across`:
warpbreaks %>%
  group_by(tension) %>%
  nest() %>%
  ungroup %>%
  mutate(models = map(data, ~glm(breaks ~ wool, data = .x)),
         jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
         means = map(models, ~emmeans::emmeans(.x, "wool", data = .x$data)),
         p_cont = map(means, ~emmeans::contrast(.x, "pairwise", infer = c(T, T))),
         across(models:p_cont, setNames, .$tension)) -> result
result$jt
#$L
# model term df1 df2 F.ratio p.value
# wool 1 Inf 5.653 0.0174
#$M
# model term df1 df2 F.ratio p.value
# wool 1 Inf 1.253 0.2630
#$H
# model term df1 df2 F.ratio p.value
# wool 1 Inf 2.321 0.1277
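An alternative, not part of the answer above, is to skip `nest()` altogether. This is a rough sketch under that assumption: `split()` returns a list already named after the factor levels, and `map()` keeps the names of its input, so each later step stays labelled without any `setNames` call. The names `by_tension`, `models` and `wool_effect` are illustrative, and the `emmeans` calls are replaced by a plain coefficient lookup to keep the example self-contained.

```r
library(purrr)

# split() returns a list named after the levels of the factor (L, M, H).
by_tension <- split(warpbreaks, warpbreaks$tension)

# map() preserves the names of its input, so the models are labelled too.
models <- map(by_tension, ~ glm(breaks ~ wool, data = .x))

# Any later step mapped over `models` inherits the same names;
# here the wool B coefficient is extracted per tension level.
wool_effect <- map_dbl(models, ~ coef(.x)[["woolB"]])

names(models)
wool_effect
```

The trade-off is that the results live in separate named lists instead of columns of one nested tibble, which may or may not suit the rest of the analysis.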