R, how to do ANOVA in a loop over a data frame?
Problem description
I would like to run a two-way ANOVA for each accession, store the p-values, and then run Tukey's HSD, but I have a problem with my input table. The data are not always complete, so the ANOVA cannot always be performed. I don't know how to make my script skip the incomplete data and keep running. My data look like this:
https://filebin.net/w5cfuwztae7yk747
The link contains an example with two accessions, but the real data have 3013 accessions, and some of them do not have all light conditions or all genotypes.
67822 AT2G41680 f HL_f_Dejan58 1.240108e+06 HL AT2G41680 f
70136 AT2G41680 f HL_f_Dejan_61 3.384010e+06 HL AT2G41680 f
72450 AT2G41680 ntrc HL_ntrc_ Dejan_62 1.410768e+05 HL AT2G41680 ntrc
74764 AT2G41680 ntrc HL_ntrc_Dejan_66 5.642197e+00 HL AT2G41680 ntrc
77078 AT2G41680 ntrc HL_ntrc_Dejan65 3.921952e+05 HL AT2G41680 ntrc
78997 AT2G41680 WT LL_WT_Dejan_41 1.016001e+07 LL AT2G41680 WT
81433 AT2G41680 WT LL_WT_Dejan_43 9.320892e+06 LL AT2G41680 WT
83869 AT2G41680 WT LL_WT_Dejan_49 8.560308e+06 LL AT2G41680 WT
There are four genotypes and four light conditions. I am trying to do something like this:
AOV <- data.frame()
IDs <- unique(Dejan_all_new_norm$Accession)
for (i in 1:length(IDs)) {
  temp <- Dejan_all_new_norm[(Dejan_all_new_norm$Accession) == IDs[i], ]
  aov2 <- aov(value ~ genotype + Light + genotype:Light, data = temp)
  AOV <- rbind(as.character(unique(IDs[i])), aov2, AOV)
}
So I want to subset each gene (Accession) and then run the ANOVA, and after that run Tukey's HSD to get something like this:
$`genotype:Light`
diff lwr upr p adj
m:FL-f:FL -7324259.81 -16715470 2066950.5 0.3486778
ntrc:FL-f:FL 1662873.54 -7728337 11054083.9 0.9999998
WT:FL-f:FL -5219263.59 -13913835 3475307.7 0.7927417
f:HL-f:FL -4936680.12 -13871535 3998174.3 0.8796738
m:HL-f:FL -7389937.49 -16324792 1544916.9 0.2496858
ntrc:HL-f:FL -7122962.46 -16057817 1811891.9 0.3102106
I would like to keep working with the simple loop from my example, because it seems the easiest way. I would appreciate any help!
Recommended answer
This is what you are looking for:
library(tidyverse)
library(broom)

# Fit the two-way ANOVA per Accession and tidy the Tukey HSD results
read_csv(file = "https://filebin.net/w5cfuwztae7yk747/two.csv") %>%
  group_by(Accession) %>%
  do(broom::tidy(TukeyHSD(aov(value ~ genotype + Light + genotype:Light, data = .)))) %>%
  ungroup()
Output:
# A tibble: 264 x 7
Accession term comparison estimate conf.low conf.high adj.p.value
<chr> <fctr> <chr> <dbl> <dbl> <dbl> <dbl>
1 AT2G41680 genotype m-f -1586182.59 -3616647.7 444282.5 1.708496e-01
2 AT2G41680 genotype ntrc-f -5705550.95 -7694992.3 -3716109.6 2.609223e-08
3 AT2G41680 genotype WT-f -1568375.95 -3557817.3 421065.4 1.647950e-01
4 AT2G41680 genotype ntrc-m -4119368.37 -6149833.5 -2088903.3 2.214399e-05
5 AT2G41680 genotype WT-m 17806.64 -2012658.5 2048271.8 9.999951e-01
6 AT2G41680 genotype WT-ntrc 4137175.00 2147733.6 6126616.4 1.464605e-05
7 AT2G41680 Light HL-FL -3854435.85 -5849789.4 -1859082.3 4.872013e-05
8 AT2G41680 Light LL-FL 1528123.46 -467230.1 3523477.0 1.844033e-01
9 AT2G41680 Light ML-FL -2821752.94 -4775345.6 -868160.3 2.283331e-03
10 AT2G41680 Light LL-HL 5382559.31 3311883.1 7453235.6 2.176770e-07
# ... with 254 more rows
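The pipeline above will abort if aov() or TukeyHSD() raises an error for any accession, which is exactly the incomplete-data case the question describes. One possible way to keep going is to wrap the fit in tryCatch() and skip the accessions that fail. The sketch below uses base R only and a made-up data frame `dat` that borrows the column names from the question (Accession, genotype, Light, value). The accession names GENE_A and GENE_B, the `rep` column, and all values are invented for illustration, so treat this as a starting point rather than a tested solution for the real table.

```r
# Synthetic stand-in for Dejan_all_new_norm (assumption: made-up values).
set.seed(1)
complete <- expand.grid(
  genotype = c("WT", "f", "m", "ntrc"),
  Light    = c("FL", "HL", "LL", "ML"),
  rep      = 1:3,
  stringsAsFactors = FALSE
)
complete$Accession <- "GENE_A"
complete$value <- rnorm(nrow(complete), mean = 1e6, sd = 1e5)

# An accession with one genotype and one light condition only:
# the two-way model cannot be fitted for it.
incomplete <- data.frame(
  genotype = "WT", Light = "LL", rep = 1:3,
  Accession = "GENE_B",
  value = rnorm(3, mean = 1e6, sd = 1e5),
  stringsAsFactors = FALSE
)
dat <- rbind(complete, incomplete)

anova_p <- list()         # ANOVA p-values per accession
tukey   <- list()         # Tukey HSD tables per accession
skipped <- character(0)   # accessions that could not be analysed

for (id in unique(dat$Accession)) {
  temp <- dat[dat$Accession == id, ]
  # Re-create the factors so that absent levels are dropped
  temp$genotype <- factor(temp$genotype)
  temp$Light    <- factor(temp$Light)

  # Any error in the fit or the post-hoc test returns NULL instead of stopping
  res <- tryCatch({
    fit <- aov(value ~ genotype * Light, data = temp)
    list(fit = fit, tukey = TukeyHSD(fit))
  }, error = function(e) NULL)

  # Skip failed fits and fits without residual degrees of freedom
  if (is.null(res) || df.residual(res$fit) == 0) {
    skipped <- c(skipped, id)
    next
  }

  # p-values of the ANOVA table (row names are padded, hence trimws)
  tab   <- summary(res$fit)[[1]]
  terms <- trimws(rownames(tab))
  keep  <- terms != "Residuals"
  anova_p[[id]] <- data.frame(Accession = id, term = terms[keep],
                              p.value = tab[["Pr(>F)"]][keep])

  # Stack the Tukey matrices (one per model term) into one data frame
  tukey[[id]] <- do.call(rbind, lapply(names(res$tukey), function(term) {
    m <- res$tukey[[term]]
    data.frame(Accession = id, term = term, comparison = rownames(m), m,
               row.names = NULL, check.names = FALSE)
  }))
}

anova_p <- do.call(rbind, anova_p)
tukey   <- do.call(rbind, tukey)
rownames(anova_p) <- NULL
rownames(tukey)   <- NULL
```

After the loop, `skipped` lists the accessions that were left out, `anova_p` holds one row per model term, and `tukey` holds the pairwise comparisons. Note that tryCatch() only catches hard errors: an accession that is missing just a few genotype and light combinations may still be fitted, possibly with NA entries or warnings, so an explicit check such as `table(temp$genotype, temp$Light)` may be worth adding before the fit.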