R, how to do ANOVA in a loop over a data frame?

Problem description

I would like to run a two-way ANOVA, store the p-values, and then run Tukey's HSD, but I have a problem with the initial table. I do not always have complete data, so the ANOVA cannot always be performed. I do not know how to make my script skip the incomplete data and keep running. My data looks like this:

https://filebin.net/w5cfuwztae7yk747

The link contains an example with two accessions, but the real data has 3013 accessions, and some of them do not have all light conditions or all genotypes.

67822  AT2G41680  f     HL_f_Dejan58       1.240108e+06  HL  AT2G41680 f
70136  AT2G41680  f     HL_f_Dejan_61      3.384010e+06  HL  AT2G41680 f
72450  AT2G41680  ntrc  HL_ntrc_ Dejan_62  1.410768e+05  HL  AT2G41680 ntrc
74764  AT2G41680  ntrc  HL_ntrc_Dejan_66   5.642197e+00  HL  AT2G41680 ntrc
77078  AT2G41680  ntrc  HL_ntrc_Dejan65    3.921952e+05  HL  AT2G41680 ntrc
78997  AT2G41680  WT    LL_WT_Dejan_41     1.016001e+07  LL  AT2G41680 WT
81433  AT2G41680  WT    LL_WT_Dejan_43     9.320892e+06  LL  AT2G41680 WT
83869  AT2G41680  WT    LL_WT_Dejan_49     8.560308e+06  LL  AT2G41680 WT

There are four genotypes and four light conditions. I am trying to do something like this:

AOV <- data.frame()
IDs <- unique(Dejan_all_new_norm$Accession)
for (i in 1:length(IDs)) {
  temp <- Dejan_all_new_norm[Dejan_all_new_norm$Accession == IDs[i], ]
  aov2 <- aov(value ~ genotype + Light + genotype:Light, data = temp)
  AOV <- rbind(as.character(unique(IDs[i])), aov2, AOV)
}

So I want to subset each gene (Accession) and then run the ANOVA, and after that run Tukey's test to get something like this:

$`genotype:Light`
                    diff       lwr        upr     p adj
m:FL-f:FL        -7324259.81 -16715470  2066950.5 0.3486778
ntrc:FL-f:FL      1662873.54  -7728337 11054083.9 0.9999998
WT:FL-f:FL       -5219263.59 -13913835  3475307.7 0.7927417
f:HL-f:FL        -4936680.12 -13871535  3998174.3 0.8796738
m:HL-f:FL        -7389937.49 -16324792  1544916.9 0.2496858
ntrc:HL-f:FL     -7122962.46 -16057817  1811891.9 0.3102106

I would like to build on this simple loop from my example, because it seems the easiest way. I would appreciate any help!

Recommended answer

This is what you are looking for:

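library(tidyverse)
library(broom)

read_csv(file = "https://filebin.net/w5cfuwztae7yk747/two.csv") %>%
  # one group per gene, so each model is fitted on a single Accession
  group_by(Accession) %>%
  # fit the two-way ANOVA, run Tukey's HSD, and flatten the result to one row per comparison
  do(broom::tidy(TukeyHSD(aov(value ~ genotype + Light + genotype:Light, data = .)))) %>%
  ungroup()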

Output:

# A tibble: 264 x 7
   Accession     term comparison    estimate   conf.low  conf.high  adj.p.value
   <chr>   <fctr>      <chr>       <dbl>      <dbl>      <dbl>        <dbl>
 1 AT2G41680 genotype        m-f -1586182.59 -3616647.7   444282.5 1.708496e-01
 2 AT2G41680 genotype     ntrc-f -5705550.95 -7694992.3 -3716109.6 2.609223e-08
 3 AT2G41680 genotype       WT-f -1568375.95 -3557817.3   421065.4 1.647950e-01
 4 AT2G41680 genotype     ntrc-m -4119368.37 -6149833.5 -2088903.3 2.214399e-05
 5 AT2G41680 genotype       WT-m    17806.64 -2012658.5  2048271.8 9.999951e-01
 6 AT2G41680 genotype    WT-ntrc  4137175.00  2147733.6  6126616.4 1.464605e-05
 7 AT2G41680    Light      HL-FL -3854435.85 -5849789.4 -1859082.3 4.872013e-05
 8 AT2G41680    Light      LL-FL  1528123.46  -467230.1  3523477.0 1.844033e-01
 9 AT2G41680    Light      ML-FL -2821752.94 -4775345.6  -868160.3 2.283331e-03
10 AT2G41680    Light      LL-HL  5382559.31  3311883.1  7453235.6 2.176770e-07
# ... with 254 more rows
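
The pipeline above does not deal with the accessions that lack some light conditions or genotypes, and it returns only the Tukey comparisons, not the ANOVA p-values. Below is a minimal base R sketch of one way to cover both, staying close to the loop in the question. It assumes the column names Accession, genotype, Light and value from the question. The toy data, the accession AT1G00001 and the helper analyse_one are made up for illustration, and the completeness rule (every genotype-by-light cell needs at least one observation) is an assumption, not something the original answer specifies.

```r
# Toy data standing in for Dejan_all_new_norm (hypothetical values).
set.seed(1)
dat <- expand.grid(
  Accession = c("AT2G41680", "AT1G00001"),  # second ID is made up
  genotype  = c("f", "m", "ntrc", "WT"),
  Light     = c("FL", "HL", "LL", "ML"),
  rep       = 1:3,
  stringsAsFactors = FALSE
)
dat$value <- rnorm(nrow(dat), mean = 5e6, sd = 1e6)
# Make the made-up accession incomplete: remove one light condition.
dat <- dat[!(dat$Accession == "AT1G00001" & dat$Light == "ML"), ]

# Number of levels a complete accession is expected to have.
n_genotype <- length(unique(dat$genotype))
n_light    <- length(unique(dat$Light))

# Hypothetical helper: fit one accession, return ANOVA p-values and Tukey table.
analyse_one <- function(temp, id) {
  temp$genotype <- factor(temp$genotype)
  temp$Light    <- factor(temp$Light)
  fit <- aov(value ~ genotype * Light, data = temp)
  tab <- summary(fit)[[1]]
  keep <- !is.na(tab[["Pr(>F)"]])          # drops the Residuals row
  anova <- data.frame(
    Accession = id,
    term      = trimws(rownames(tab))[keep],
    p.value   = tab[["Pr(>F)"]][keep],
    row.names = NULL
  )
  tk <- TukeyHSD(fit, which = "genotype:Light")[["genotype:Light"]]
  tukey <- data.frame(Accession = id, comparison = rownames(tk), tk,
                      row.names = NULL, check.names = FALSE)
  list(anova = anova, tukey = tukey)
}

anova_p <- list()
tukey   <- list()
skipped <- character(0)

for (id in unique(dat$Accession)) {
  temp  <- dat[dat$Accession == id, ]
  cells <- table(temp$genotype, temp$Light)
  # Assumed rule: all genotypes, all light conditions, no empty cell.
  complete <- nrow(cells) == n_genotype &&
    ncol(cells) == n_light && all(cells > 0)
  # Any remaining fitting error also leads to a skip instead of a stop.
  res <- if (complete) {
    tryCatch(analyse_one(temp, id), error = function(e) NULL)
  } else {
    NULL
  }
  if (is.null(res)) {
    skipped <- c(skipped, id)
    next
  }
  anova_p[[id]] <- res$anova
  tukey[[id]]   <- res$tukey
}

# Stack the per-accession results into two data frames.
anova_p <- do.call(rbind, anova_p)
tukey   <- do.call(rbind, tukey)
```

After the loop, anova_p holds one row per model term and accession, tukey holds the genotype:Light comparisons, and skipped lists the accessions that were passed over (here the incomplete AT1G00001). A looser completeness rule may suit the real data better, since aov() can still fit some unbalanced designs.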
