Parallel wilcox.test using group_by and summarise


Problem description

There must be an R-ly way to call wilcox.test over multiple observations in parallel using group_by. I've spent a good deal of time reading up on this but still can't figure out a call to wilcox.test that does the job. Example data and code below, using magrittr pipes and summarize().

library(dplyr)
library(magrittr)

# create a data frame where x is the dependent variable, id1 is a category variable (here with five levels), and id2 is a binary category variable used for the two-sample wilcoxon test
df <- data.frame(x=abs(rnorm(50)),id1=rep(1:5,10), id2=rep(1:2,25))

# make sure piping and grouping are called correctly, with "sum" function as a well-behaving example function 
df %>% group_by(id1) %>% summarise(s=sum(x))
df %>% group_by(id1,id2) %>% summarise(s=sum(x))

# make sure wilcox.test is called correctly 
wilcox.test(x~id2, data=df, paired=FALSE)$p.value

# yet, cannot call wilcox.test within pipe with summarise (regardless of group_by). Expected output is five p-values (one for each level of id1)
df %>% group_by(id1) %>% summarise(w=wilcox.test(x~id2, data=., paired=FALSE)$p.value) 
df %>% summarise(wilcox.test(x~id2, data=., paired=FALSE))

# even specifying formula argument by name doesn't help
df %>% group_by(id1) %>% summarise(w=wilcox.test(formula=x~id2, data=., paired=FALSE)$p.value)

The buggy calls yield this error:

Error in wilcox.test.formula(c(1.09057358373486, 
    2.28465932554436, 0.885617572657959,  : 'formula' missing or incorrect

Thanks for your help; I hope it will be helpful to others with similar questions as well.

Recommended answer

You can do this with base R (although the result is a cumbersome list):

by(df, df$id1, function(x) { wilcox.test(x~id2, data=x, paired=FALSE)$p.value })
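
If the list printed by by() is awkward to work with, one possible base R alternative is to split the data frame by id1 and collect the p-values with sapply(), which returns a named numeric vector. This is a sketch rather than part of the original answer; set.seed(1) and the names pvals and d are assumptions added for reproducibility and illustration.

```r
# Assumption: the seed is added only so the example is reproducible
set.seed(1)
df <- data.frame(x = abs(rnorm(50)), id1 = rep(1:5, 10), id2 = rep(1:2, 25))

# Split into one data frame per level of id1, then run the test on each piece
pvals <- sapply(split(df, df$id1), function(d) {
  wilcox.test(x ~ id2, data = d, paired = FALSE)$p.value
})

pvals  # named numeric vector, one p-value per level of id1
```

The names of pvals are the levels of id1, so individual results can be looked up with, for example, pvals["3"].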

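Or use ddply from the plyr package (not dplyr):

# ddply() and .() come from plyr; calling them via plyr:: avoids masking dplyr functions such as summarise()
plyr::ddply(df, plyr::.(id1), function(x) { wilcox.test(x~id2, data=x, paired=FALSE)$p.value })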

  id1        V1
1   1 0.3095238
2   2 1.0000000
3   3 0.8412698
4   4 0.6904762
5   5 0.3095238
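
For readers who want to stay inside a dplyr pipeline, as the question intended, one approach that avoids the formula interface is to subset x by id2 inside summarise(). This is a minimal sketch, assuming id2 takes only the values 1 and 2 as in the example data. set.seed(1), the result name res and the column name w are assumptions for illustration and are not part of the original answer.

```r
library(dplyr)

# Assumption: the seed is added only so the example is reproducible
set.seed(1)
df <- data.frame(x = abs(rnorm(50)), id1 = rep(1:5, 10), id2 = rep(1:2, 25))

# Within each id1 group, compare x for id2 == 1 against x for id2 == 2
res <- df %>%
  group_by(id1) %>%
  summarise(w = wilcox.test(x[id2 == 1], x[id2 == 2], paired = FALSE)$p.value)

res  # one row per level of id1, with the p-value in column w
```

Passing the two samples as plain vectors sidesteps the question of how the formula and data=. are evaluated inside summarise(), which is where the calls in the question fail.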
