Using purrr with expand.grid to loop over formulas for a t.test while conditioning on another variable
Problem description
I'd like to make some purrr code more concise. I have a df with one dependent variable (y) and 4 independent variables (x1, x2, x3, x4). I also have one conditioning variable that takes 2 levels (z, is either zero or 1). I'd like to run 8 t-tests: y ~ x1 [z==0], y ~ x2 [z==0] ... y ~ x1 [z==1], y ~ x2 [z==1] etc. I'd like to return a single dataframe with the tidy tests stacked on top of each other.
I really want to generalize this method for more combinations of predictors, so building a formula and using expand.grid seems like the best way to go. I'd also like to use a combination of dplyr/purrr/broom to do this. The following works, but I'm wondering, is there a way to get everything into a single pipe?
library(tidyverse)
library(broom)
df <- data.frame(y = rnorm(100), x1 = sample(0:1, 100, replace = TRUE), x2 = sample(0:1, 100, replace = TRUE), x3 = sample(0:1, 100, replace = TRUE), x4 = sample(0:1, 100, replace = TRUE), z = sample(0:1, 100, replace = TRUE))
ivs <- c("x1", "x2", "x3", "x4")
med <- c(0, 1)
models <- expand.grid(ivs, med) %>% mutate(frm = paste0("y ~ ", Var1))
formula <- models$frm
cond <- models$Var2
models <- map2_df(formula, cond, ~tidy(t.test(as.formula(.x), data=df[df$z==.y,])))
I'm wondering why, for instance, the following doesn't work.
models <- expand.grid(ivs, med) %>% mutate(frm = paste0("y ~ ", Var1)) %>% map2_df(.$frm, .$Var2, ~tidy(t.test(as.formula(.x), data=df[df$z==.y,])))
Recommended answer
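Your first code worked because formula and cond were passed to map2_df as two separate vectors, which it iterates over in parallel. That is not what happens in the piped version: the dot placeholder (.) appears there only inside the nested calls .$frm and .$Var2, so %>% still inserts the data frame as the first argument. map2_df then receives the whole data frame as .x, frm as .y, and Var2 in the position where the function is expected, so the call fails.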
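A minimal sketch of the magrittr behaviour behind this, kept separate from the t-test code. The helper count_args and the small grid data frame are hypothetical names used only for illustration. It shows that %>% still passes the left-hand side as the first argument when the dot is used only inside nested calls such as .$frm, and that wrapping the right-hand side in braces suppresses this.

```r
library(magrittr)

# Hypothetical helper: reports how many arguments it was called with.
count_args <- function(...) length(list(...))

# Hypothetical stand-in for the expand.grid() result in the question.
grid <- data.frame(frm = c("y ~ x1", "y ~ x2"), Var2 = c(0, 1))

# The dot appears only inside nested calls, so the data frame is still
# inserted as the first argument: count_args(grid, grid$frm, grid$Var2).
n_plain <- grid %>% count_args(.$frm, .$Var2)

# Braces suppress the first-argument insertion:
# count_args(grid$frm, grid$Var2).
n_braced <- grid %>% { count_args(.$frm, .$Var2) }

print(c(plain = n_plain, braced = n_braced))
```

By the same logic, wrapping the last step of the question's pipe in braces, as in %>% { map2_df(.$frm, .$Var2, ~ tidy(...)) }, should also work, although this is an alternative to the pmap_df approach rather than part of the original answer.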
To make it work, you can use pmap_df to loop through each row of the data frame created inside the pipe and refer to the columns by position using ..1, ..2, ..3, and so on.
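To make the positional references concrete, here is a small sketch, assuming a hand-built grid data frame (a hypothetical stand-in for the expand.grid result, with columns in the same order: Var1, Var2, frm). It uses pmap_chr instead of pmap_df so that it runs without any model fitting. The second call shows that an ordinary function with arguments named after the columns gives the same result, which can be easier to read than ..2 and ..3.

```r
library(purrr)

# Hypothetical stand-in for expand.grid(ivs, med) %>% mutate(frm = ...).
grid <- data.frame(
  Var1 = c("x1", "x2"),
  Var2 = c(0, 1),
  frm  = c("y ~ x1", "y ~ x2"),
  stringsAsFactors = FALSE
)

# pmap_* walks the rows; ..2 is the second column (Var2), ..3 the third (frm).
labels <- pmap_chr(grid, ~ paste(..3, "| z ==", ..2))

# Same iteration with arguments matched to the columns by name.
labels_named <- pmap_chr(
  grid,
  function(Var1, Var2, frm) paste(frm, "| z ==", Var2)
)

print(labels)
```

The named form does not depend on column order, so it keeps working if more columns are added to the grid later, provided the function accepts every column (for example through a trailing ... argument).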
library(tidyverse)
library(broom)
df <- data.frame(
  y = rnorm(100),
  x1 = sample(0:1, 100, replace = TRUE),
  x2 = sample(0:1, 100, replace = TRUE),
  x3 = sample(0:1, 100, replace = TRUE),
  x4 = sample(0:1, 100, replace = TRUE),
  z = sample(0:1, 100, replace = TRUE)
)
ivs <- c("x1", "x2", "x3", "x4")
med <- c(0, 1)
models <- expand.grid(ivs, med) %>%
  mutate(frm = paste0("y ~ ", Var1))
formula <- models$frm
cond <- models$Var2
models <- map2_df(formula, cond, ~ tidy(t.test(as.formula(.x), data = df[df$z == .y, ])))
# using pmap_df to loop over the rows of the data frame
# (a data frame is essentially a list of columns, iterated in parallel);
# ..2 is the Var2 column and ..3 is the frm column
models2 <- expand.grid(ivs, med) %>%
  mutate(frm = paste0("y ~ ", Var1)) %>%
  pmap_df(., ~ tidy(t.test(as.formula(..3), data = df[df$z == ..2, ])))
models2
#> estimate estimate1 estimate2 statistic p.value parameter
#> 1 0.2039970 -0.002158780 -0.20615579 0.6372003 0.52724597 44.68250
#> 2 -0.4488714 -0.341650359 0.10722106 -1.4646944 0.15052718 41.56782
#> 3 -0.3016148 -0.246980034 0.05463477 -0.9189260 0.36427350 35.86492
#> 4 0.2601315 -0.004184604 -0.26431615 0.8668975 0.39031605 47.94586
#> 5 -0.2303647 -0.099116913 0.13124775 -0.8420942 0.40422649 44.61732
#> 6 0.5992558 0.385767243 -0.21348854 2.0517453 0.04957898 28.21589
#> 7 0.5027880 0.243581778 -0.25920622 1.9502349 0.05803462 40.84076
#> 8 -0.2735021 -0.101687239 0.17181481 -0.9498541 0.34888013 34.04935
#> conf.low conf.high method alternative
#> 1 -0.440936247 0.8489303 Welch Two Sample t-test two.sided
#> 2 -1.067524893 0.1697821 Welch Two Sample t-test two.sided
#> 3 -0.967373762 0.3641441 Welch Two Sample t-test two.sided
#> 4 -0.343220972 0.8634841 Welch Two Sample t-test two.sided
#> 5 -0.781476516 0.3207472 Welch Two Sample t-test two.sided
#> 6 0.001181137 1.1973304 Welch Two Sample t-test two.sided
#> 7 -0.017929386 1.0235054 Welch Two Sample t-test two.sided
#> 8 -0.858637554 0.3116335 Welch Two Sample t-test two.sided
identical(models, models2)
#> [1] TRUE
Created on 2018-03-25 by the reprex package (v0.2.0).