dplyr summarise multiple columns using t.test

Question

Is it possible to run a t.test on multiple variables against the same categorical variable without first reshaping the dataset, as done below?

data(mtcars)
library(dplyr)
library(tidyr)
j <- mtcars %>% gather(var, val, disp:qsec)
t <- j %>% group_by(var) %>% do(te = t.test(val ~ vs, data = .))

t %>% summarise(p = te$p.value)

I tried using

mtcars %>% summarise_each_(funs = (t.test(. ~ vs))$p.value, vars = disp:qsec)

but it throws an error.

Bonus: how can t %>% summarise(p = te$p.value) also include the name of the grouping variable?

Recommended answer

Following the discussion with @aosmith and @Misha, here is one approach. As @aosmith wrote in the comments, you want to do the following.

mtcars %>%
    summarise_each(funs(t.test(.[vs == 0], .[vs == 1])$p.value), vars = disp:qsec)

#         vars1        vars2      vars3        vars4        vars5
#1 2.476526e-06 1.819806e-06 0.01285342 0.0007281397 3.522404e-06

vs is either 0 or 1 (the group). To run a t-test between the two groups within a variable (e.g., disp), you need to subset the data by vs, as @aosmith suggested. The output columns vars1 to vars5 correspond to disp, hp, drat, wt and qsec, in that order. Thank you for the contribution.
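In current dplyr releases, summarise_each() and funs() are deprecated, so the same subsetting idea can be sketched with across() (available from dplyr 1.0.0). This is a minimal sketch, not the original answer's code: the names p_wide and p_long are hypothetical, and the pivot_longer() step is only one possible way to get the variable names into a column, which is what the bonus question asks for.

```r
library(dplyr)
library(tidyr)

# Welch two-sample t-test p-value for each column from disp to qsec,
# comparing the rows with vs == 0 against the rows with vs == 1.
# across() keeps the original column names (disp, hp, drat, wt, qsec).
p_wide <- mtcars %>%
  summarise(across(disp:qsec, ~ t.test(.x[vs == 0], .x[vs == 1])$p.value))

# Optional: one row per variable, with the variable name in its own column.
# Only the one-row result is reshaped here, not the dataset itself.
p_long <- p_wide %>%
  pivot_longer(everything(), names_to = "var", values_to = "p")

p_long
```

Because across() preserves column names, no manual renaming of the result is needed.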

What I originally suggested applies to a different situation, in which you simply compare two columns. Here is sample data and code.

foo <- data.frame(country = "Iceland",
                  year = 2014,
                  id = 1:30,
                  A = sample.int(1e5, 30, replace = TRUE),
                  B = sample.int(1e5, 30, replace = TRUE),
                  C = sample.int(1e5, 30, replace = TRUE),
                  stringsAsFactors = FALSE)

If you want to run t-tests for the A-C and B-C combinations, the following is one way.

foo2 <- foo %>%
        summarise_each(funs(t.test(., C, paired = TRUE)$p.value), vars = A:B)

names(foo2) <- colnames(foo[4:5])

#          A         B
#1 0.2937979 0.5316822
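The paired comparison can also be sketched with across(), which keeps the column names A and B and so makes the names() fix-up unnecessary. This is a sketch under assumptions: the set.seed() call and its value are additions for reproducibility (the original data is unseeded, so the p-values will not match those shown above), and the sample data is trimmed to the columns the test uses.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, not part of the original example

foo <- data.frame(id = 1:30,
                  A = sample.int(1e5, 30, replace = TRUE),
                  B = sample.int(1e5, 30, replace = TRUE),
                  C = sample.int(1e5, 30, replace = TRUE))

# Paired t-test of each of A and B against C; the result has columns A and B.
foo2 <- foo %>%
  summarise(across(A:B, ~ t.test(.x, C, paired = TRUE)$p.value))

foo2
```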
