Problem with `mutate()` input `data` in ANOVA (rstatix)

Problem description

This is driving me crazy. I am using anova_test() from rstatix and it's telling me that my columns don't exist when they clearly do.

This is what my data frame looks like:

ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3) 
Form = c("A", "A", "A", "B", "B", "B", "A", "A", "A", "B", "B", "B", "A", "A", "A", "B", "B", "B")
Pen = c("Red", "Blue", "Green", "Red", "Blue", "Green", "Red", "Blue", "Green","Red", "Blue", "Green","Red", "Blue", "Green","Red", "Blue", "Green")
Time = c(20, 4, 6, 2, 76, 3, 86, 35, 74, 94, 14, 35, 63, 12, 15, 73, 87, 33)
df <- data.frame(ID, Form, Pen, Time)

ID, Form, and Pen are factors; Time is numeric. Each subject completed forms A and B with red, blue, and green pens, and I measured how long each subject took to complete the form.

This is a fake dataset that I made up specifically to ask this question. In reality, this data frame is derived from a larger dataset with several more variables. Each combination has many more observations (so not just one data point for subject 1, Form A, and the red pen, as in this example, but several), so I've summarized to get the mean Time.

df <- original.df %>% dplyr::select(ID, Form, Pen, Time)
df <- df %>% dplyr::group_by(ID, Form, Pen) %>% dplyr::summarise(Time = mean(Time))
df <- df %>% convert_as_factor(ID, Form, Pen)
df$Time <- as.numeric(df$Time)

I want to test the main and interaction effects, so I'm running a 2 by 3 repeated measures ANOVA (a two-way ANOVA, because Form and Pen are the two independent variables).

aov <- rstatix::anova_test(data = df, dv = Time, wid = ID, within = c(Form, Pen))

and I KEEP getting this error:

Error: Problem with `mutate()` input `data`.
x Can't subset columns that don't exist.
x Columns `ID` and `Form` don't exist.
ℹ Input `data` is `map(.data$data, .f, ...)`.

WHY?! Any help would be greatly appreciated. I've been searching for solutions for HOURS and I'm getting pretty frustrated.

Recommended answer

Thank you for adding the additional details to the post. Based on what you've provided, it looks like you need to ungroup your df before passing it to anova_test(), e.g.

#install.packages("rstatix")
library(rstatix)
library(tidyverse)

ID = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3) 
Form = c("A", "A", "A", "B", "B", "B", "A", "A", "A", "B", "B", "B", "A", "A", "A", "B", "B", "B")
Pen = c("Red", "Blue", "Green", "Red", "Blue", "Green", "Red", "Blue", "Green","Red", "Blue", "Green","Red", "Blue", "Green","Red", "Blue", "Green")
Time = c(20, 4, 6, 2, 76, 3, 86, 35, 74, 94, 14, 35, 63, 12, 15, 73, 87, 33)
original.df <- data.frame(ID, Form, Pen, Time)

df <- original.df %>%
  dplyr::select(ID, Form, Pen, Time)
df <- df %>%
  dplyr::group_by(ID, Form, Pen) %>%
  dplyr::summarise(Time = mean(Time))
df <- df %>%
  convert_as_factor(ID, Form, Pen)
df$Time <- as.numeric(df$Time)
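# summarise() drops only the last grouping variable (Pen), so df is still
# grouped by ID and Form; remove that grouping before calling anova_test()
df <- ungroup(df)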

aov <- rstatix::anova_test(data = df, dv = Time, wid = ID, within = c(Form, Pen))
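
As an alternative to a separate ungroup() call, here is a minimal sketch that assumes dplyr 1.0.0 or later, where summarise() accepts a `.groups` argument. The name `df_means` is only illustrative and is not from the original post.

```r
library(dplyr)

# Same toy data as in the question, built with rep() for brevity
original.df <- data.frame(
  ID   = rep(1:3, each = 6),
  Form = rep(rep(c("A", "B"), each = 3), times = 3),
  Pen  = rep(c("Red", "Blue", "Green"), times = 6),
  Time = c(20, 4, 6, 2, 76, 3, 86, 35, 74, 94, 14, 35, 63, 12, 15, 73, 87, 33)
)

# .groups = "drop" returns an ungrouped tibble straight from summarise(),
# so no ungroup() step is needed afterwards
df_means <- original.df %>%
  group_by(ID, Form, Pen) %>%
  summarise(Time = mean(Time), .groups = "drop")

is_grouped_df(df_means)  # FALSE
```

The resulting `df_means` can then go through convert_as_factor() and anova_test() in the same way as `df` above.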

You can check whether a data frame is grouped using str(), e.g. running str(df) before and after ungroup() shows you the difference. Please let me know if you are still getting errors after making this change.
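
Besides str(), dplyr offers more direct checks. The sketch below uses a small made-up data frame (`toy`, `grouped`, and `flat` are illustrative names, not from the original post) to show which grouping variables remain after summarise() and that ungroup() clears them.

```r
library(dplyr)

# Small illustrative dataset: 2 subjects x 2 forms x 2 pens
toy <- data.frame(
  ID   = rep(1:2, each = 4),
  Form = rep(rep(c("A", "B"), each = 2), times = 2),
  Pen  = rep(c("Red", "Blue"), times = 4),
  Time = c(20, 4, 2, 76, 86, 35, 94, 14)
)

# "drop_last" is what summarise() does by default here; it is spelled out
# to make the behaviour explicit and to silence the informational message
grouped <- toy %>%
  group_by(ID, Form, Pen) %>%
  summarise(Time = mean(Time), .groups = "drop_last")

is_grouped_df(grouped)  # TRUE
group_vars(grouped)     # "ID" "Form"

flat <- ungroup(grouped)
is_grouped_df(flat)     # FALSE
group_vars(flat)        # character(0)
```

The two leftover grouping variables, ID and Form, are the same two columns named in the error message from the question.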
