如何将 wilcox.test 应用于 R 中的整个数据帧? [英] How to apply the wilcox.test to a whole dataframe in R?

查看:28
本文介绍了如何将 wilcox.test 应用于 R 中的整个数据帧?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个数据框,其中一个分组因子(第一列)具有多个级别(两个以上)和多个包含数据的列.我想将 wilcox.test 应用于整个日期框架,以将每个组变量与其他变量进行比较.我该怎么做?

I have a data frame with one grouping factor (the first column) with multiple levels (more than two) and several columns with data. I want to apply the wilcox.test to the whole date frame to compare the each group variables with the others. How can I do this?

更新:我知道 wilcox.test 只会测试两组之间的差异,而我的数据框包含三个.但我更感兴趣的是如何做到这一点,而不是使用什么测试.很可能会删除一组,但我还没有决定,所以我想测试所有变体.

UPDATE: I know that the wilcox.test will only test for difference between two groups and my data frame contains three. But I am interested more in how to do this, than what test to use. Most likely that one group will be removed, but I have not decided yet on that, so I want to test all variants.

这是一个示例:

structure(list(group = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), var1 = c(9.3, 
9.05, 7.78, 7.11, 7.14, 8.12, 7.5, 7.84, 7.8, 7.52, 8.84, 6.98, 
6.1, 6.89, 6.5, 7.5, 7.8, 5.5, 6.61, 7.65, 7.68), var2 = c(11L, 
11L, 10L, 1L, 3L, 7L, 11L, 11L, 11L, 11L, 4L, 1L, 1L, 1L, 2L, 
2L, 1L, 4L, 8L, 8L, 1L), var3 = c(7L, 11L, 3L, 7L, 11L, 2L, 11L, 
5L, 11L, 11L, 5L, 11L, 11L, 2L, 9L, 9L, 3L, 8L, 11L, 11L, 2L), 
    var4 = c(11L, 11L, 11L, 11L, 6L, 11L, 11L, 11L, 10L, 7L, 
    11L, 2L, 11L, 3L, 11L, 11L, 6L, 11L, 1L, 11L, 11L), var5 = c(11L, 
    1L, 2L, 2L, 11L, 11L, 1L, 10L, 2L, 11L, 1L, 3L, 11L, 11L, 
    8L, 8L, 11L, 11L, 11L, 2L, 9L)), .Names = c("group", "var1", 
"var2", "var3", "var4", "var5"), class = "data.frame", row.names = c(NA, 
-21L))

更新

感谢大家的回答!

推荐答案

更新我的答案以跨列工作

Updating my answer to work across columns

test.fun <- function(dat, col) { 

 c1 <- combn(unique(dat$group),2)
 sigs <- list()
 for(i in 1:ncol(c1)) {
    sigs[[i]] <- wilcox.test(
                   dat[dat$group == c1[1,i],col],
                   dat[dat$group == c1[2,i],col]
                 )
    }
    names(sigs) <- paste("Group",c1[1,],"by Group",c1[2,])

 tests <- data.frame(Test=names(sigs),
                    W=unlist(lapply(sigs,function(x) x$statistic)),
                    p=unlist(lapply(sigs,function(x) x$p.value)),row.names=NULL)

 return(tests)
}


tests <- lapply(colnames(dat)[-1],function(x) test.fun(dat,x))
names(tests) <- colnames(dat)[-1]
# tests <- do.call(rbind, tests) reprints as data.frame

# This solution is not "slow" and outperforms the other answers significantly: 
system.time(
  rep(
   tests <- lapply(colnames(dat)[-1],function(x) test.fun(dat,x)),10000
  )
)

#   user  system elapsed 
#  0.056   0.000   0.053 

结果:

tests

$var1
                Test  W          p
1 Group 1 by Group 2 28 0.36596737
2 Group 1 by Group 3 39 0.05927406
3 Group 2 by Group 3 38 0.27073136

$var2
                Test    W         p
1 Group 1 by Group 2 19.0 0.8205958
2 Group 1 by Group 3 36.5 0.1159945
3 Group 2 by Group 3 40.5 0.1522726

$var3
                Test    W         p
1 Group 1 by Group 2 13.0 0.2425786
2 Group 1 by Group 3 23.5 1.0000000
3 Group 2 by Group 3 41.0 0.1261647

$var4
                Test  W         p
1 Group 1 by Group 2 26 0.4323470
2 Group 1 by Group 3 30 0.3729664
3 Group 2 by Group 3 29 0.9479518

$var5
                Test    W         p
1 Group 1 by Group 2 24.0 0.7100968
2 Group 1 by Group 3 19.0 0.5324295
3 Group 2 by Group 3 17.5 0.2306609

这篇关于如何将 wilcox.test 应用于 R 中的整个数据帧?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆