Inference() Function Insisting That I Use ANOVA Versus Two-Sided Hypothesis Test; R/RStudio


Question

I'm trying to use a custom function called inference(), as shown in the code below. The function has no documentation, but it comes from my DASI course on Coursera. According to the feedback I've received, I'm using it correctly. I want to run a two-sided hypothesis test on my class variable and my wordsum variable, that is, on the difference between the mean wordsum of two categories, lower class and working class: average wordsum for working class minus average wordsum for lower class. However, the function keeps insisting that I do an ANOVA test. That doesn't work for me, since I'm trying to reject the null hypothesis and build a confidence interval for the difference between two independent means. I've looked at the function's source, but as I'm no R expert, I don't see anything out of the ordinary. Any help is greatly appreciated.

Code:

load(url("http://bit.ly/dasi_gss_ws_cl"))
source("http://bit.ly/dasi_inference")

summary(gss)
by(gss$wordsum, gss$class, mean)
boxplot(gss$wordsum ~ gss$class)

gss_clean = na.omit(subset(gss, class == "WORKING" | class =="LOWER"))

inference(y = gss_clean$wordsum, x = gss_clean$class, est = "mean", type = "ht", 
          null = 0, alternative = "twosided", method = "theoretical")

Returns:

Response variable: numerical, Explanatory variable: categorical
Error: Use alternative = 'greater' for ANOVA or chi-square test.
In addition: Warning message:
Ignoring null value since it's undefined for ANOVA.

Recommended answer

You need

gss_clean <- droplevels(gss_clean)

Then your inference() call works:

Response variable: numerical, Explanatory variable: categorical
Difference between two means
Summary statistics:
n_LOWER = 41, mean_LOWER = 5.0732, sd_LOWER = 2.2404
n_WORKING = 407, mean_WORKING = 5.7494, sd_WORKING = 1.8652
Observed difference between means (LOWER-WORKING) = -0.6762
H0: mu_LOWER - mu_WORKING = 0 
HA: mu_LOWER - mu_WORKING != 0 
Standard error = 0.362 
Test statistic: Z =  -1.868 
p-value =  0.0616 
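The reported standard error, test statistic, and p-value can be checked by hand from the summary statistics above. This is only a sketch: it assumes inference() uses the unpooled standard error sqrt(s1^2/n1 + s2^2/n2) and a normal reference distribution, which is consistent with the printed numbers but is not confirmed by the function's source. The variable names are made up for illustration.

```r
# Summary statistics copied from the output above.
n_lower   <- 41;  mean_lower   <- 5.0732; sd_lower   <- 2.2404
n_working <- 407; mean_working <- 5.7494; sd_working <- 1.8652

# Assumption: unpooled standard error of the difference between two means.
se <- sqrt(sd_lower^2 / n_lower + sd_working^2 / n_working)

# Observed difference (LOWER - WORKING) and the resulting Z statistic.
obs_diff <- mean_lower - mean_working
z <- obs_diff / se

# Two-sided p-value from the standard normal distribution.
p_value <- 2 * pnorm(-abs(z))

round(c(se = se, z = z, p_value = p_value), 4)
```

Up to rounding of the inputs, these values agree with the standard error of 0.362, Z of -1.868, and p-value of about 0.06 printed by inference().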

The problem is that subsetting the data keeps every level of the class factor, including the ones that no longer have any observations. Unless you drop those unused levels, the internal machinery of inference() thinks you have a 4-level categorical variable, so it can't do a t-test or equivalent 2-category test: it has to do a one-way ANOVA or analogue.
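The behaviour can be reproduced without the course data. The sketch below uses a small made-up data frame (the values and the names toy, toy_sub, and toy_clean are hypothetical, not the real gss survey) to show that subset() keeps empty factor levels and droplevels() removes them.

```r
# Toy stand-in for the gss data; the values are invented for illustration.
toy <- data.frame(
  class   = factor(c("LOWER", "WORKING", "MIDDLE", "UPPER", "WORKING", "LOWER")),
  wordsum = c(4, 6, 7, 8, 5, 3)
)

# Keeping only two classes removes rows, but not factor levels.
toy_sub <- subset(toy, class == "WORKING" | class == "LOWER")
nlevels(toy_sub$class)   # still 4: MIDDLE and UPPER remain as empty levels
table(toy_sub$class)     # the empty levels show up with a count of 0

# droplevels() discards levels that have no observations.
toy_clean <- droplevels(toy_sub)
nlevels(toy_clean$class) # now 2
levels(toy_clean$class)  # "LOWER" "WORKING"
```

Checking nlevels() or table() on the grouping variable before calling inference() is a quick way to see whether it will be treated as a two-group or a multi-group comparison.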
