Using purrr with expand.grid to loop over formulas for a t.test while conditioning on another variable

Question

I'd like to make some purrr code more concise. I have a data frame with one dependent variable (y) and four independent variables (x1, x2, x3, x4). I also have one conditioning variable, z, that takes two levels (0 or 1). I'd like to run 8 t-tests: y ~ x1 [z==0], y ~ x2 [z==0] ... y ~ x1 [z==1], y ~ x2 [z==1], etc. I'd like to return a single data frame with the tidied test results stacked on top of each other.

I really want to generalize this method to more combinations of predictors, so building a formula and using expand.grid seems like the best way to go. I'd also like to do this with a combination of dplyr, purrr and broom. The following works, but is there a way to get everything into a single pipe?

library(tidyverse)
library(broom)

df <- data.frame(y = rnorm(100), x1 = sample(0:1, 100, replace = TRUE), 
                 x2 = sample(0:1, 100, replace = TRUE), 
                 x3 = sample(0:1, 100, replace = TRUE), 
                 x4 = sample(0:1, 100, replace = TRUE), 
                 z = sample(0:1, 100, replace = TRUE))

ivs <- c("x1", "x2", "x3", "x4")
med <- c(0, 1)

models <- expand.grid(ivs, med) %>% mutate(frm = paste0("y ~ ", Var1)) 

formula <- models$frm   
cond <- models$Var2

models <-  map2_df(formula, cond, ~tidy(t.test(as.formula(.x), data=df[df$z==.y,])))

Why, for instance, doesn't the following work?

models <- expand.grid(ivs, med) %>% mutate(frm = paste0("y ~ ", Var1)) %>% map2_df(.$frm, .$Var2, ~tidy(t.test(as.formula(.x), data=df[df$z==.y,])))

Recommended answer

Your first code worked because formula and cond were passed to map2_df as two separate vectors. That is no longer the case once the call sits at the end of the pipe: %>% supplies the data frame itself as the first argument, so map2_df receives the data frame as .x, and .$frm and .$Var2 are pushed into the following argument positions. You cannot get at the columns with .x$frm or .x$Var2 either.
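The original map2_df call can also be kept inside the pipe by wrapping it in curly braces. This is a minimal sketch, not part of the original answer, relying on magrittr's documented behaviour that a right-hand side enclosed in braces is evaluated with `.` bound to the left-hand side and without it being inserted as the first argument. The seed and the name models_braced are assumptions made for illustration.

```r
library(dplyr)
library(purrr)
library(broom)

set.seed(42)  # assumed seed, only so the sketch is reproducible
n <- 100
df <- data.frame(
  y  = rnorm(n),
  x1 = sample(0:1, n, replace = TRUE),
  x2 = sample(0:1, n, replace = TRUE),
  x3 = sample(0:1, n, replace = TRUE),
  x4 = sample(0:1, n, replace = TRUE),
  z  = sample(0:1, n, replace = TRUE)
)

ivs <- c("x1", "x2", "x3", "x4")
med <- c(0, 1)

# The braces stop %>% from passing the data frame as the first argument,
# so .$frm and .$Var2 really are the two vectors map2_df iterates over.
models_braced <- expand.grid(ivs, med) %>%
  mutate(frm = paste0("y ~ ", Var1)) %>%
  {
    map2_df(.$frm, .$Var2,
            ~ tidy(t.test(as.formula(.x), data = df[df$z == .y, ])))
  }

models_braced
```

The result has one row per formula and level of z, in the same order as the rows produced by expand.grid.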

To make it work, you can use pmap_df to loop over each row of the data frame created inside the pipe, referring to the columns by position with ..1, ..2, ..3 and so on:

library(tidyverse)
library(broom)

df <- data.frame(y = rnorm(100), x1 = sample(0:1, 100, replace = TRUE), 
                 x2 = sample(0:1, 100, replace = TRUE), 
                 x3 = sample(0:1, 100, replace = TRUE), 
                 x4 = sample(0:1, 100, replace = TRUE), 
                 z = sample(0:1, 100, replace = TRUE))

ivs <- c("x1", "x2", "x3", "x4")
med <- c(0, 1)

models <- expand.grid(ivs, med) %>% 
  mutate(frm = paste0("y ~ ", Var1)) 

formula <- models$frm   
cond <- models$Var2

models <-  map2_df(formula, cond, ~ tidy(t.test(as.formula(.x), data = df[df$z == .y, ])))

# pmap_df walks the columns in parallel (a data frame is a list of columns),
# i.e. row by row: ..1 is Var1, ..2 is Var2, ..3 is frm
models2 <- expand.grid(ivs, med) %>% 
  mutate(frm = paste0("y ~ ", Var1)) %>% 
  pmap_df(., ~ tidy(t.test(as.formula(..3), data = df[df$z == ..2, ])))
models2

#>     estimate    estimate1   estimate2  statistic    p.value parameter
#> 1  0.2039970 -0.002158780 -0.20615579  0.6372003 0.52724597  44.68250
#> 2 -0.4488714 -0.341650359  0.10722106 -1.4646944 0.15052718  41.56782
#> 3 -0.3016148 -0.246980034  0.05463477 -0.9189260 0.36427350  35.86492
#> 4  0.2601315 -0.004184604 -0.26431615  0.8668975 0.39031605  47.94586
#> 5 -0.2303647 -0.099116913  0.13124775 -0.8420942 0.40422649  44.61732
#> 6  0.5992558  0.385767243 -0.21348854  2.0517453 0.04957898  28.21589
#> 7  0.5027880  0.243581778 -0.25920622  1.9502349 0.05803462  40.84076
#> 8 -0.2735021 -0.101687239  0.17181481 -0.9498541 0.34888013  34.04935
#>       conf.low conf.high                  method alternative
#> 1 -0.440936247 0.8489303 Welch Two Sample t-test   two.sided
#> 2 -1.067524893 0.1697821 Welch Two Sample t-test   two.sided
#> 3 -0.967373762 0.3641441 Welch Two Sample t-test   two.sided
#> 4 -0.343220972 0.8634841 Welch Two Sample t-test   two.sided
#> 5 -0.781476516 0.3207472 Welch Two Sample t-test   two.sided
#> 6  0.001181137 1.1973304 Welch Two Sample t-test   two.sided
#> 7 -0.017929386 1.0235054 Welch Two Sample t-test   two.sided
#> 8 -0.858637554 0.3116335 Welch Two Sample t-test   two.sided

identical(models, models2)
#> [1] TRUE
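Positional references such as ..2 and ..3 break silently if the column order changes, and the stacked output does not say which row belongs to which test. The sketch below is one possible variation, not part of the original answer: it names the expand.grid columns so that pmap_df matches them to function arguments by name, and it adds the predictor and the level of z to each result row. The helper run_test, the column names iv and z_level, and the seed are hypothetical names chosen for illustration.

```r
library(dplyr)
library(purrr)
library(broom)

set.seed(2018)  # assumed seed, only so the sketch is reproducible
n <- 100
df <- data.frame(
  y  = rnorm(n),
  x1 = sample(0:1, n, replace = TRUE),
  x2 = sample(0:1, n, replace = TRUE),
  x3 = sample(0:1, n, replace = TRUE),
  x4 = sample(0:1, n, replace = TRUE),
  z  = sample(0:1, n, replace = TRUE)
)

# Hypothetical helper: one t-test of y on `iv` within the rows where z == z_level.
# reformulate() builds the formula y ~ <iv> from the character name.
run_test <- function(iv, z_level) {
  fit <- t.test(reformulate(iv, response = "y"), data = df[df$z == z_level, ])
  mutate(tidy(fit), iv = iv, z = z_level)
}

# pmap_df passes each row as named arguments (iv = ..., z_level = ...),
# so the column order no longer matters.
models3 <- expand.grid(iv = c("x1", "x2", "x3", "x4"),
                       z_level = c(0, 1),
                       stringsAsFactors = FALSE) %>%
  pmap_df(run_test)

models3
```

Setting stringsAsFactors = FALSE keeps iv as a character vector, which is what reformulate expects.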

Created on 2018-03-25 by the reprex package (v0.2.0).
