Create t.test table with dplyr?

Question

Suppose I have data that looks like this:

set.seed(031915)
myDF <- data.frame(
  Name= rep(c("A", "B"), times = c(10,10)),
  Group = rep(c("treatment", "control", "treatment", "control"), times = c(5,5,5,5)),
  X = c(rnorm(n=5,mean = .05, sd = .001), rnorm(n=5,mean = .02, sd = .001),
        rnorm(n=5,mean = .08, sd = .02), rnorm(n=5,mean = .03, sd = .02))
)

I want to create a t.test table with a row for "A" and one for "B"

I can write my own function that does that:

ttestbyName <- function(Name) {
  b <- t.test(myDF$X[myDF$Group == "treatment" & myDF$Name==Name], 
              myDF$X[myDF$Group == "control" & myDF$Name==Name], 
              conf.level = 0.90)
  dataNameX <- data.frame(Name = Name,
                          treatment = round(b$estimate[[1]], digits = 4),
                          control = round(b$estimate[[2]], digits = 4),
                          CI = paste('(',round(b$conf.int[[1]], 
                                                  digits = 4),', ',
                                        round(b$conf.int[[2]], 
                                              digits = 4), ')',
                                        sep=""),
                          pvalue = round(b$p.value, digits = 4),
                          ntreatment = nrow(myDF[myDF$Group == "treatment" & myDF$Name==Name,]),
                          ncontrol = nrow(myDF[myDF$Group == "control" & myDF$Name==Name,]))
}
library(parallel)
Test_by_Name <- mclapply(unique(myDF$Name), ttestbyName)
Test_by_Name <- do.call("rbind", Test_by_Name)

The output looks like this:

  Name treatment control               CI pvalue ntreatment ncontrol
1    A    0.0500  0.0195 (0.0296, 0.0314) 0.0000          5        5
2    B    0.0654  0.0212  (0.0174, 0.071) 0.0161          5        5

I'm wondering if there is a cleaner way to do this with dplyr. I thought about using group_by, but I'm a little lost.

Thanks!

Recommended answer

Not much cleaner, but here's an improvement:

library(dplyr)

ttestbyName <- function(myName) {
  bt <- filter(myDF, Group=="treatment", Name==myName)
  bc <- filter(myDF, Group=="control", Name==myName)

  b <- t.test(bt$X, bc$X, conf.level=0.90)

  dataNameX <- data.frame(Name = myName,
                      treatment = round(b$estimate[[1]], digits = 4),
                      control = round(b$estimate[[2]], digits = 4),
                      CI = paste('(',round(b$conf.int[[1]], 
                                           digits = 4),', ',
                                 round(b$conf.int[[2]], 
                                       digits = 4), ')',
                                 sep=""),
                      pvalue = round(b$p.value, digits = 4),
                      ntreatment = nrow(bt),  # changes only in
                      ncontrol = nrow(bc))    # these 2 nrow() args
}

You should really replace the do.call("rbind", ...) step with rbindlist from data.table:

library(data.table)
Test_by_Name <- lapply(unique(myDF$Name), ttestbyName)
Test_by_Name <- rbindlist(Test_by_Name)

Or, even better, use %>% pipes:

Test_by_Name <- myDF$Name %>% 
                unique %>% 
                lapply(., ttestbyName) %>% 
                rbindlist

> Test_by_Name
   Name treatment control               CI pvalue ntreatment ncontrol
1:    A    0.0500  0.0195 (0.0296, 0.0314) 0.0000          5        5
2:    B    0.0654  0.0212  (0.0174, 0.071) 0.0161          5        5
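
Since the question mentions group_by, here is a minimal sketch of how the same table might be built with grouping alone, without filtering myDF by name inside the helper. It is not part of the original answer: it assumes dplyr 0.8.1 or later (for group_modify()), and the helper name ttest_row is made up for illustration. The data is recreated from the question so the block runs on its own.

```r
library(dplyr)

# Recreate the data from the question so this sketch is self-contained
set.seed(031915)
myDF <- data.frame(
  Name  = rep(c("A", "B"), times = c(10, 10)),
  Group = rep(c("treatment", "control", "treatment", "control"),
              times = c(5, 5, 5, 5)),
  X = c(rnorm(n = 5, mean = .05, sd = .001), rnorm(n = 5, mean = .02, sd = .001),
        rnorm(n = 5, mean = .08, sd = .02),  rnorm(n = 5, mean = .03, sd = .02))
)

# Hypothetical helper: receives the rows of one Name group (d) and the
# group key (key, unused here) and returns a one-row summary of the t-test
ttest_row <- function(d, key) {
  trt <- d$X[d$Group == "treatment"]
  ctl <- d$X[d$Group == "control"]
  b <- t.test(trt, ctl, conf.level = 0.90)

  data.frame(
    treatment  = round(b$estimate[[1]], digits = 4),
    control    = round(b$estimate[[2]], digits = 4),
    CI         = paste0("(", round(b$conf.int[[1]], digits = 4), ", ",
                        round(b$conf.int[[2]], digits = 4), ")"),
    pvalue     = round(b$p.value, digits = 4),
    ntreatment = length(trt),
    ncontrol   = length(ctl)
  )
}

# One row per Name; group_modify() adds the Name column back automatically
Test_by_Name <- myDF %>%
  group_by(Name) %>%
  group_modify(ttest_row) %>%
  ungroup()

print(Test_by_Name)
```

With this shape the helper never refers to myDF directly, so it should work on any data frame that has Group and X columns, and neither lapply nor rbindlist is needed to assemble the result.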
