dplyr summarise multiple columns using t.test


Question

Is it possible to run t.test on multiple variables against the same categorical variable without first reshaping the dataset, as is done below?

data(mtcars)
library(dplyr)
library(tidyr)
j <- mtcars %>% gather(var, val, disp:qsec)
t <- j %>% group_by(var) %>% do(te = t.test(val ~ vs, data = .))

t %>% summarise(p = te$p.value)

I tried using

mtcars %>% summarise_each_(funs = (t.test(. ~ vs))$p.value, vars = disp:qsec)

but it throws an error.

Bonus: How can t %>% summarise(p = te$p.value) also include the name of the grouping variable?

Recommended answer

After all the discussion with @aosmith and @Misha, here is one approach. As @aosmith wrote in the comments, you want to do the following.

mtcars %>%
    summarise_each(funs(t.test(.[vs == 0], .[vs == 1])$p.value), vars = disp:qsec)

#         vars1        vars2      vars3        vars4        vars5
#1 2.476526e-06 1.819806e-06 0.01285342 0.0007281397 3.522404e-06

vs is either 0 or 1 (the group). If you want to run a t-test between the two groups for a variable (e.g., disp), it seems that you need to subset the data as @aosmith suggested. Thank you both for the contribution.
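The same subsetting idea can be written with across(), since summarise_each() and funs() were deprecated in later dplyr releases. This is only a sketch, assuming dplyr 1.0.0 or newer and tidyr 1.0.0 or newer; the names p_wide and p_long are made up for illustration. It also touches on the bonus question, because across() keeps the original column names and a pivot turns them into a column.

```r
library(dplyr)
library(tidyr)

# p-value of a two-sample t-test for each column in disp:qsec,
# comparing the rows with vs == 0 against the rows with vs == 1.
# Inside across(), .x is the current column and vs is looked up in mtcars.
p_wide <- mtcars %>%
  summarise(across(disp:qsec, ~ t.test(.x[vs == 0], .x[vs == 1])$p.value))

# Long form: one row per variable, with the variable name kept in `var`.
p_long <- p_wide %>%
  pivot_longer(everything(), names_to = "var", values_to = "p")

print(p_long)
```

The result has one column per tested variable (disp, hp, drat, wt, qsec) instead of vars1 to vars5, so no renaming step is needed afterwards.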

What I originally suggested works in a different situation, in which you simply compare two columns. Here is some sample data and code.

foo <- data.frame(country = "Iceland",
                  year = 2014,
                  id = 1:30,
                  A = sample.int(1e5, 30, replace = TRUE),
                  B = sample.int(1e5, 30, replace = TRUE),
                  C = sample.int(1e5, 30, replace = TRUE),
                  stringsAsFactors = FALSE)

If you want to run t-tests for the A-C and B-C combinations, the following is one way.

# paired = TRUE spelled out in full (the original relied on partial matching of `pair`)
foo2 <- foo %>%
        summarise_each(funs(t.test(., C, paired = TRUE)$p.value), vars = A:B)

names(foo2) <- colnames(foo[4:5])

#          A         B
#1 0.2937979 0.5316822
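For comparison, the paired version can also be sketched with across(), which keeps the names A and B so the names() assignment is not needed. This assumes dplyr 1.0.0 or newer. The seed and the trimmed-down data frame are arbitrary choices made here for reproducibility, so the p-values will not match the ones printed above, which came from an unseeded sample.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the random sample is reproducible
foo <- data.frame(id = 1:30,
                  A = sample.int(1e5, 30, replace = TRUE),
                  B = sample.int(1e5, 30, replace = TRUE),
                  C = sample.int(1e5, 30, replace = TRUE))

# Paired t-test of A against C and of B against C.
# .x is the current column (A, then B); C is looked up in foo.
foo2 <- foo %>%
  summarise(across(A:B, ~ t.test(.x, C, paired = TRUE)$p.value))

print(foo2)
```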
