lapply instead of a for loop for randomised hypothesis testing in R

Question

I have a df that looks something like this:

set.seed(42)
ID <- sample(1:30, 100, replace = TRUE)
Trait <- sample(0:1, 100, replace = TRUE)
Year <- sample(1992:1999, 100, replace = TRUE)
df <- cbind(ID, Trait, Year)
df <- as.data.frame(df)

Here ID is an individual organism, Trait is the presence/absence of a phenotype, and Year is the year the observation was made.

I would like to model whether Trait is random between individuals, something like this:

library(MCMCglmm) 
m <- MCMCglmm(Trait ~ ID, random = ~ Year, data = df, family = "categorical")

Now I would like to shuffle the Trait column and run x permutations, to check whether my observed mean and CI fall outside what is expected by chance. I could do this with a for loop, but I'd rather use a tidyverse solution. I've read that lapply is a better(?) alternative, but I am struggling to find a walk-through specific enough for me to follow.

I'd appreciate any advice offered here.

Cheers!

Jamie

Recommended answer

EDIT October 10th. Cleaned up the code and, per the comment below, added code that gives you back a nicely organized tibble/data frame.

### decide how many shuffles you want and name them
### in an orderly fashion for the output

shuffles <- 1:10
names(shuffles) <- paste0("shuffle_", shuffles)

library(MCMCglmm)
library(dplyr)
library(tibble)
library(purrr)

ddd <- purrr::map(shuffles,
                  ~ df %>%
                     mutate(Trait = sample(Trait)) %>%
                     MCMCglmm(fixed = Trait ~ ID,
                              random = ~ Year,
                              data = .,
                              family = "categorical",
                              verbose = FALSE)) %>%
   purrr::map( ~ tibble::as_tibble(summary(.x)$solutions, rownames = "model_term")) %>%
   dplyr::bind_rows(., .id = 'shuffle')
ddd
#> # A tibble: 20 x 7
#>    shuffle    model_term  post.mean `l-95% CI` `u-95% CI` eff.samp pMCMC
#>    <chr>      <chr>           <dbl>      <dbl>      <dbl>    <dbl> <dbl>
#>  1 shuffle_1  (Intercept)  112.         6.39     233.       103.   0.016
#>  2 shuffle_1  ID            -6.31     -13.5       -0.297    112.   0.014
#>  3 shuffle_2  (Intercept)   24.9      -72.5      133.       778.   0.526
#>  4 shuffle_2  ID            -0.327     -6.33       5.33     849.   0.858
#>  5 shuffle_3  (Intercept)    4.39     -77.3       87.4      161.   0.876
#>  6 shuffle_3  ID             1.04      -3.84       5.99     121.   0.662
#>  7 shuffle_4  (Intercept)    7.71     -79.0      107.       418.   0.902
#>  8 shuffle_4  ID             0.899     -4.40       6.57     408.   0.694
#>  9 shuffle_5  (Intercept)   30.4      -62.4      144.       732.   0.51 
#> 10 shuffle_5  ID            -0.644     -6.61       4.94     970.   0.866
#> 11 shuffle_6  (Intercept)  -45.5     -148.        42.7      208.   0.302
#> 12 shuffle_6  ID             4.73      -0.211     11.6       89.1  0.058
#> 13 shuffle_7  (Intercept)  -16.2     -133.        85.9      108.   0.696
#> 14 shuffle_7  ID             2.47      -2.42      10.3       47.8  0.304
#> 15 shuffle_8  (Intercept)    0.568      0.549      0.581      6.60 0.001
#> 16 shuffle_8  ID            -0.0185    -0.0197    -0.0168     2.96 0.001
#> 17 shuffle_9  (Intercept)   -6.95    -112.        92.2      452.   0.886
#> 18 shuffle_9  ID             2.07      -3.30       8.95     370.   0.476
#> 19 shuffle_10 (Intercept)   43.8      -57.0      159.       775.   0.396
#> 20 shuffle_10 ID            -1.36      -7.44       5.08     901.   0.62
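
Since the question asked about lapply specifically, here is a rough base R equivalent of the purrr pipeline above. It is only a sketch: `shuffle_names` and `tidy_one` are names made up for this illustration (they are not part of MCMCglmm), and the data are rebuilt with `data.frame()` so the block runs on its own. The fitted numbers will differ from the table above because the shuffles and the MCMC chains are random.

```r
library(MCMCglmm)

# The example data from the question, rebuilt so this block is self-contained.
set.seed(42)
df <- data.frame(
  ID    = sample(1:30, 100, replace = TRUE),
  Trait = sample(0:1, 100, replace = TRUE),
  Year  = sample(1992:1999, 100, replace = TRUE)
)

# Labels for the shuffles (illustrative name, 10 shuffles as in the answer).
shuffle_names <- paste0("shuffle_", 1:10)

# One model fit per shuffle; lapply() returns a list of MCMCglmm objects.
fits <- lapply(shuffle_names, function(nm) {
  shuffled <- df
  shuffled$Trait <- sample(shuffled$Trait)  # permute the response only
  MCMCglmm(fixed = Trait ~ ID,
           random = ~ Year,
           data = shuffled,
           family = "categorical",
           verbose = FALSE)
})

# Hypothetical helper: turn one fit's fixed-effect summary into a data frame.
tidy_one <- function(fit, label) {
  sol <- summary(fit)$solutions           # matrix, one row per model term
  out <- data.frame(shuffle = label,
                    model_term = rownames(sol),
                    sol,
                    check.names = FALSE)  # keep names such as "l-95% CI"
  rownames(out) <- NULL
  out
}

# Stack the per-shuffle summaries into one data frame.
res <- do.call(rbind, Map(tidy_one, fits, shuffle_names))
rownames(res) <- NULL
res
```

The result has the same shape as the tibble above (two rows per shuffle, seven columns), just as a plain data frame. Whether lapply or purrr::map is "better" is mostly a matter of taste here; both avoid the bookkeeping of a for loop.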

Your original data

set.seed(42)
ID <- sample(1:30, 100, replace = TRUE)
Trait <- sample(0:1, 100, replace = TRUE)
Year <- sample(1992:1999, 100, replace = TRUE)
df <- cbind(ID, Trait, Year)
df <- as.data.frame(df)
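
The question also asks how to check whether the observed estimate falls outside what the shuffles produce. One possible way, sketched below under assumptions, is to compare the observed posterior mean for ID against the posterior means from the shuffled fits. The helper `id_post_mean`, the choice of 20 permutations, and the use of the posterior mean as the test statistic are all illustrative choices, not something taken from the answer above.

```r
library(MCMCglmm)

# The example data from the question, rebuilt so this block is self-contained.
set.seed(42)
df <- data.frame(
  ID    = sample(1:30, 100, replace = TRUE),
  Trait = sample(0:1, 100, replace = TRUE),
  Year  = sample(1992:1999, 100, replace = TRUE)
)

# Hypothetical helper: fit the model and return the posterior mean for ID.
id_post_mean <- function(data) {
  fit <- MCMCglmm(fixed = Trait ~ ID,
                  random = ~ Year,
                  data = data,
                  family = "categorical",
                  verbose = FALSE)
  summary(fit)$solutions["ID", "post.mean"]
}

# Estimate from the unshuffled data.
observed <- id_post_mean(df)

# Estimates from shuffled data; 20 is an arbitrary, small illustrative number.
n_perm <- 20
null_est <- vapply(seq_len(n_perm), function(i) {
  shuffled <- df
  shuffled$Trait <- sample(shuffled$Trait)
  id_post_mean(shuffled)
}, numeric(1))

# Central 95% range of the shuffled estimates.
null_range <- quantile(null_est, c(0.025, 0.975))

# Two-sided permutation p-value with the usual +1 correction.
p_perm <- (sum(abs(null_est) >= abs(observed)) + 1) / (n_perm + 1)

observed
null_range
p_perm
```

In practice you would want many more permutations than 20. Note also that some shuffles in the output above have very small effective sample sizes (for example shuffle_8), so it may be worth checking that the chains mix well before trusting the comparison.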
