Unable to run two-way repeated measures ANOVA: 0 (non-NA) cases


Question

I am trying to follow the Datanovia tutorial on two-way repeated measures ANOVA.

A quick overview of my dataset:

I have measured the number of different bacterial species in 12 sampling units over time. I have 16 time points and 2 groups. I have organised my data as a tibble called "richness":

# A tibble: 190 x 4
   id    selection.group Day   value
   <fct> <fct>           <fct> <dbl>
 1 KRH1  KR              2      111.
 2 KRH2  KR              2      141.
 3 KRH3  KR              2      110.
 4 KRH1  KR              4      126 
 5 KRH2  KR              4      144 
 6 KRH3  KR              4      135.
 7 KRH1  KR              6      115.
 8 KRH2  KR              6      113.
 9 KRH3  KR              6      107.
10 KRH1  KR              8      119.

The id column identifies each sampling unit, and selection.group is a factor with two levels (KR and RK).

richness <- tibble(
  id = factor(c("KRH1", "KRH3", "KRH2", "RKH2", "RKH1", "RKH3")), 
  selection.group = factor(c("KR", "KR", "KR", "RK", "RK", "RK")), 
  Day = factor(c(2,2,4,2,4,4)), 
  value = c(111, 110, 144,  92,  85,  69))  # subset of original data

My tibble appears to be in the same format as the one in the tutorial:

> str(selfesteem2)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   72 obs. of  4 variables:
 $ id       : Factor w/ 12 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ treatment: Factor w/ 2 levels "ctr","Diet": 1 1 1 1 1 1 1 1 1 1 ...
 $ time     : Factor w/ 3 levels "t1","t2","t3": 1 1 1 1 1 1 1 1 1 1 ...
 $ score    : num  83 97 93 92 77 72 92 92 95 92 ..

Before I can run the repeated measures ANOVA I must check my data for normality. I copied the approach proposed in the tutorial.

#my code
richness %>%
  group_by(selection.group, Day) %>%
  shapiro_test(value)

#tutorial code
selfesteem2 %>%
  group_by(treatment, time) %>%
  shapiro_test(score)

But I get the error message "Error: Column `variable` is unknown" when I try to run my code. Does anyone know why this happens?

I tried to continue without any assurance that my data are normally distributed and ran the ANOVA:

res.aov <- rstatix::anova_test(
  data = richness, dv = value, wid = id,
  within = c(selection.group, Day)
  )

But I get this error message: Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 0 (non-NA) cases

I have checked for NA values with any(is.na(richness)), which returns FALSE. I have also checked table(richness$selection.group, richness$Day) to be sure my setup is correct:


     2 4 6 8 12 16 20 24 28 29 30 32 36 40 44 50
  KR 6 6 6 6  6  6  6  6  6  6  6  5  6  6  6  6
  RK 6 6 6 6  6  5  6  6  6  6  6  6  6  6  6  6

The setup appears correct. I would be very grateful for tips on solving this.

Best regards, Madeleine

Below is a subset of my dataset in a reproducible format:

library(tidyverse)
library(rstatix)
library(tibble)

richness_subset = data.frame(
  id = c("KRH1", "KRH3", "KRH2", "RKH2", "RKH1", "RKH3"), 
  selection.group = c("KR", "KR", "KR", "RK", "RK", "RK"), 
  Day = c(2,2,4,2,4,4), 
  value = c(111, 110, 144,  92,  85,  69))

# convert the columns of richness_subset itself (not richness) to factors
richness_subset$Day = factor(richness_subset$Day)
richness_subset$selection.group = factor(richness_subset$selection.group)
richness_subset$id = factor(richness_subset$id)

richness_subset = tibble::as_tibble(richness_subset)

richness_subset %>%
  group_by(selection.group, Day) %>%
  shapiro_test(value)

# the shapiro_test() call above gives Error: Column `variable` is unknown

res.aov <- rstatix::anova_test(
  data = richness_subset, dv = value, wid = id,
  within = c(selection.group, Day)
)

# gives Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
#  0 (non-NA) cases

Recommended answer

I created something similar to the design of your data:

set.seed(111)
richness = data.frame(
  id = rep(c("KRH1", "KRH2", "KRH3"), 6),
  selection.group = rep(c("KR", "RK"), each = 9),
  Day = rep(c(2, 4, 6), each = 3, times = 2),
  value = rpois(18, 100)
)

richness$Day = factor(richness$Day)
richness$id = factor(richness$id)

First, shapiro_test: there is a bug in the function, and the column you want to test cannot be named "value":

# gives Error: Column `variable` is unknown
richness %>% shapiro_test(value)

#works
richness %>% mutate(X = value) %>% shapiro_test(X)
# A tibble: 1 x 3
  variable statistic     p
  <chr>        <dbl> <dbl>
1 X            0.950 0.422
1 X            0.963 0.843
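
If renaming the column is inconvenient, the same per-cell normality check can be sketched with base R's shapiro.test(), which does not depend on the column name. The data frame richness_demo and its values below are hypothetical, made up only to illustrate the layout (2 groups, 2 days, 4 measurements per cell); this is a minimal sketch, not a drop-in replacement for the tutorial's pipeline.

```r
# Hypothetical data: 2 groups x 2 days, 4 measurements per cell.
richness_demo <- data.frame(
  selection.group = rep(c("KR", "RK"), each = 8),
  Day             = factor(rep(c(2, 4), each = 4, times = 2)),
  value           = c(111, 141, 110, 126, 144, 135, 115, 113,
                       92,  85,  69,  98, 101,  88,  95,  79)
)

# Split the response by group and day, giving one numeric vector per cell.
cells <- split(richness_demo$value,
               list(richness_demo$selection.group, richness_demo$Day))

# Run the Shapiro-Wilk test on each cell and collect W and the p-value.
shapiro_by_cell <- t(sapply(cells, function(v) {
  res <- shapiro.test(v)
  c(statistic = unname(res$statistic), p.value = res$p.value)
}))

print(shapiro_by_cell)
```

Note that shapiro.test() needs at least 3 values per cell that are not all identical, so very small cells cannot be tested this way.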

Second, for the ANOVA, this works for me:

rstatix::anova_test(
  data = richness, dv = value, wid = id,
  within = c(selection.group, Day)
  )

In my example every term can be estimated. What I suspect is that, in your data, one of the terms is a linear combination of the others. Using a modified version of my example, in which each id is observed in only some of the group-by-day cells,

set.seed(111)
richness = data.frame(
  id = rep(c("KRH1", "KRH2", "KRH3", "KRH4", "KRH5", "KRH6"), 3),
  selection.group = rep(c("KR", "RK"), each = 9),
  Day = rep(c(2, 4, 6), each = 3, times = 2),
  value = rpois(18, 100)
)

richness$Day = factor(richness$Day)
richness$id = factor(richness$id)

rstatix::anova_test(
  data = richness, dv = value, wid = id,
  within = c(selection.group, Day)
  )

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  0 (non-NA) cases

This gives exactly the same error. You can check for the problem using:

lm(value~id+Day:selection.group,data=richness)


   Call:
lm(formula = value ~ id + Day:selection.group, data = richness)

Coefficients:
           (Intercept)                     id1                     id2  
               101.667                  -3.000                  -6.000  
                   id3                     id4                     id5  
                -6.000                   1.889                  11.556  
Day2:selection.groupKR  Day4:selection.groupKR  Day6:selection.groupKR  
                 1.667                 -12.000                   9.333  
Day2:selection.groupRK  Day4:selection.groupRK  Day6:selection.groupRK  
                -1.667                      NA                      NA 

The Day4:selection.groupRK and Day6:selection.groupRK coefficients are not estimable (NA) because they are linear combinations of the terms that precede them in the model.
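
A quick way to look for this kind of problem before fitting anything is to count how often each id occurs in every group-by-day cell. The sketch below rebuilds the layouts of the two examples above without the response values. The names complete, incomplete, cell_counts and is_fully_crossed are hypothetical helpers introduced here for illustration, not part of rstatix.

```r
# Layout of the first example: every id is measured in every cell.
complete <- data.frame(
  id              = rep(c("KRH1", "KRH2", "KRH3"), 6),
  selection.group = rep(c("KR", "RK"), each = 9),
  Day             = rep(c(2, 4, 6), each = 3, times = 2)
)

# Layout of the second example: six ids, each present in only some cells.
incomplete <- transform(
  complete,
  id = rep(c("KRH1", "KRH2", "KRH3", "KRH4", "KRH5", "KRH6"), 3)
)

# Count observations per id in every group-by-day cell.
cell_counts <- function(d) {
  table(d$id, interaction(d$selection.group, d$Day))
}

# A design with both factors within subjects needs every id in every cell.
is_fully_crossed <- function(d) all(cell_counts(d) > 0)

is_fully_crossed(complete)    # TRUE
is_fully_crossed(incomplete)  # FALSE

# The zeros in this table show which cells each id is missing from.
print(cell_counts(incomplete))
```

Running table(richness$id, richness$selection.group) on the real data gives a similar view per group. If each id turns out to belong to a single selection.group, that factor varies between subjects rather than within them, and one adjustment worth trying is to pass it to the between argument of anova_test() and keep only Day in within. This is a suggestion under that assumption, not something verified on the original data.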
