"anova_test" function error (0 (non-NA) cases) and linear combination for two-way repeated measures ANOVA
Problem description
I am trying to run a two-way repeated measures ANOVA in R using the anova_test function in the rstatix package. I am roughly following the tutorial found here. My data consists of several ant colonies ("Colony"), each split into 3 treatments ("Size"). I collected data ("g") over 8 timepoints ("Time"). I have uploaded a subset of my data on GitHub, but here is a brief summary:
# A tibble: 24 x 6
Species Colony Fragment Size Time g
<fct> <fct> <fct> <fct> <fct> <dbl>
1 obs 5 5L L 1 0.565
2 obs 2 2L L 2 0.002
3 obs 8 8L L 3 0.699
4 obs 12 12L L 4 0.257
5 obs 12 12L L 5 0.131
6 obs 3 3L L 6 0.014
7 obs 10 10L L 7 0.15
8 obs 12 12L L 8 0.054
9 obs 10 10M M 1 0.448
10 obs 8 8M M 2 0.135
# ... with 14 more rows
I have tried running the two-way repeated measures ANOVA in three different ways, with the following code:
aov <- df %>% anova_test(g ~ Size*Time + Error(Colony/(Size*Time)))
aov <- df %>% anova_test(dv=g, wid = Colony, within= c(Size,Time))
aov <- anova_test(data = df, dv=g, wid=Colony, within=c(Size, Time))
They each output the following error:
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
0 (non-NA) cases
I have tried the same code on two sample datasets that are formatted similarly to my dataset, and the function works perfectly (and each method outputs the same results). Here are summaries of the sample datasets for reference:
# A tibble: 6 x 4
id treatment time score
<fct> <fct> <fct> <dbl>
1 7 ctr t1 92
2 6 ctr t2 65
3 12 ctr t3 62
4 6 Diet t1 76
5 9 Diet t2 94
6 7 Diet t3 87
# A tibble: 6 x 4
len supp dose id
<dbl> <fct> <dbl> <int>
1 21.5 OJ 0.5 2
2 14.5 OJ 1 9
3 22.4 OJ 2 3
4 4.2 VC 0.5 1
5 17.3 VC 1 4
6 29.5 VC 2 10
I have verified that my data does not contain any NA values: any(is.na(df)) returns FALSE.
I came across a similar question, where one helpful poster suggested that this error might be due to a linear combination rather than NA values. I checked my data using lm(g ~ Colony+Time:Size, data=df) and, indeed, it appears that I do have a linear combination:
Call:
lm(formula = g ~ Colony + Time:Size, data = df)
Coefficients:
(Intercept) Colony1 Colony2 Colony3 Colony4 Colony5 Time1:SizeL Time2:SizeL Time3:SizeL
0.044167 -0.118549 -0.108424 0.076868 0.073243 0.034368 0.213000 0.351167 0.199833
Time4:SizeL Time5:SizeL Time6:SizeL Time7:SizeL Time8:SizeL Time1:SizeM Time2:SizeM Time3:SizeM Time4:SizeM
0.060667 0.071333 0.005000 0.017000 -0.029167 0.239667 0.216333 0.174667 0.050500
Time5:SizeM Time6:SizeM Time7:SizeM Time8:SizeM Time1:SizeS Time2:SizeS Time3:SizeS Time4:SizeS Time5:SizeS
0.069500 0.033167 0.011500 -0.003667 -0.015500 0.081167 0.020000 0.042500 0.026333
Time6:SizeS Time7:SizeS Time8:SizeS
-0.014333 -0.000500 NA
However, I do not understand why. The Time8:SizeS category is essentially the same as all of the other Time:Size combinations. If anyone can explain why I might be running into this error, or has a solution for how I could carry out a two-way repeated measures ANOVA (with or without anova_test) on my data, I would greatly appreciate it!
Thanks in advance!
Recommended answer
I would need to read the code of rstatix::anova_test again to be sure, but your design is fine: it is balanced. What causes the problem is the extra columns (Species and Fragment). I suspect the pivoting inside the function goes haywire when they are present, so select only the columns the model uses:
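library(rstatix)
library(dplyr)

# Load the data and convert the subject and time identifiers to factors
df <- read.csv("https://raw.githubusercontent.com/mwest9/sample_data/master/test_repeat_anova.csv")
df$Colony <- factor(df$Colony)
df$Time <- factor(df$Time)

# Keep only the columns the model uses before calling anova_test
df %>%
  select(g, Size, Time, Colony) %>%
  anova_test(g ~ Size * Time + Error(Colony / (Size * Time)))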
ANOVA Table (type III tests)

     Effect DFn DFd     F       p p<.05   ges
1      Size   2  10 4.098 0.05000       0.075
2      Time   7  35 5.428 0.00028     * 0.209
3 Size:Time  14  70 1.595 0.10200       0.099
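Since the question also asks for a route that does not depend on anova_test, here is a minimal base R sketch. It uses simulated data with the same layout as the question (6 colonies, 3 sizes, 8 timepoints); the values of g are random placeholders, not the author's measurements. It first checks that the design is balanced and then fits the same Error() formula with aov(). This is one possible approach under those assumptions, not a verified replacement for the rstatix output.

```r
# Simulated stand-in for the real data (hypothetical values):
# 6 colonies x 3 sizes x 8 timepoints, one observation per cell.
set.seed(1)
sim <- expand.grid(
  Colony = factor(1:6),
  Size   = factor(c("L", "M", "S")),
  Time   = factor(1:8)
)
sim$g <- rnorm(nrow(sim))

# Balance check: every Colony x Size x Time cell should hold exactly one row.
cell_counts <- table(sim$Colony, sim$Size, sim$Time)
is_balanced <- all(cell_counts == 1)
print(is_balanced)

# Two-way repeated measures ANOVA in base R. Colony is the subject;
# Size and Time are both within-subject factors.
fit <- aov(g ~ Size * Time + Error(Colony / (Size * Time)), data = sim)
print(summary(fit))
```

To try this on the real data, replace sim with the data frame read from the CSV, after converting Colony, Size and Time to factors.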
Note that with this call it reports only the ANOVA table, and not the other outputs related to sphericity:
• Mauchly’s Test for Sphericity: If any within-Ss variables with more than 2 levels are present, a data frame containing the results of Mauchly’s test for Sphericity. Only reported for effects that have more than 2 levels because sphericity necessarily holds for effects with only 2 levels.
• Sphericity Corrections: If any within-Ss variables are present, a data frame containing the Greenhouse-Geisser and Huynh-Feldt epsilon values, and corresponding corrected p-values.
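On the NA coefficient for Time8:SizeS in the question's lm() output: this appears to be a property of the formula rather than of the data, so it may be unrelated to the anova_test error. When Time:Size is used without the Time and Size main effects, R codes the interaction with one indicator column per cell (24 here). Those columns sum to the intercept column, so the last one is aliased. The sketch below illustrates this on simulated data with the same layout; the values of g are random placeholders, not the author's measurements.

```r
# Simulated balanced data (hypothetical values): 6 colonies x 3 sizes x 8 timepoints.
set.seed(1)
sim <- expand.grid(
  Colony = factor(1:6),
  Size   = factor(c("L", "M", "S")),
  Time   = factor(1:8)
)
sim$g <- rnorm(nrow(sim))

# Same formula as in the question: interaction only, no main effects.
# The 24 Time:Size indicator columns add up to the intercept column,
# so lm() cannot estimate all of them and reports one as NA.
fit_q <- lm(g ~ Colony + Time:Size, data = sim)
aliased_q <- names(coef(fit_q))[is.na(coef(fit_q))]
print(aliased_q)

# With the main effects included, the interaction is coded with contrasts
# and the model matrix has full rank, so no coefficient is aliased.
fit_full <- lm(g ~ Colony + Time * Size, data = sim)
n_aliased_full <- sum(is.na(coef(fit_full)))
print(n_aliased_full)
```

If the same pattern shows up on the real data, the NA in the lm() output does not indicate a missing or duplicated Time:Size cell.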