R中使用for循环的卡方分析 [英] Chi Square Analysis using for loop in R

查看:71
本文介绍了R中使用for循环的卡方分析的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试对数据中的所有变量组合进行卡方分析,我的代码是:

I'm trying to do chi square analysis for all combinations of variables in the data and my code is:

Data <- esoph[ , 1:3]
OldStatistic <- NA
for(i in 1:(ncol(Data)-1)){
for(j in (i+1):ncol(Data)){
Statistic <- data.frame("Row"=colnames(Data)[i], "Column"=colnames(Data)[j],
                     "Chi.Square"=round(chisq.test(Data[ ,i], Data[ ,j])$statistic, 3),
                     "df"=chisq.test(Data[ ,i], Data[ ,j])$parameter,
                     "p.value"=round(chisq.test(Data[ ,i], Data[ ,j])$p.value, 3),
                      row.names=NULL)
temp <- rbind(OldStatistic, Statistic)
OldStatistic <- Statistic
Statistic <- temp
}
}

str(Data)
'data.frame':   88 obs. of  3 variables:
 $ agegp: Ord.factor w/ 6 levels "25-34"<"35-44"<..: 1 1 1 1 1 1 1 1 1 1 ...
 $ alcgp: Ord.factor w/ 4 levels "0-39g/day"<"40-79"<..: 1 1 1 1 2 2 2 2 3 3 ...
 $ tobgp: Ord.factor w/ 4 levels "0-9g/day"<"10-19"<..: 1 2 3 4 1 2 3 4 1 2 ...


Statistic
    Row Column Chi.Square df p.value
1 agegp  tobgp      2.400 15       1
2 alcgp  tobgp      0.619  9       1

我的代码给出了变量 1 与变量 3 以及变量 2 与变量 3 的卡方分析输出,但变量 1 与变量 2 之间缺失.我努力尝试但无法修复代码.任何意见和建议将不胜感激.我想对所有可能的组合进行交叉制表.提前致谢.

My code gives my the chi square analysis output for variable 1 vs variable 3, and variable 2 vs variable 3 and is missing for variable 1 vs variable 2. I tried hard but could not fixed the code. Any comment and suggestion will be highly appreciated. I'd like like to do cross tabulation for all possible combinations. Thanks in advance.

编辑

我曾经在 SPSS 中进行过这种分析,但现在我想切换到 R.

I used to do this kind of analysis in SPSS but now I want to switch to R.

推荐答案

您的数据示例将不胜感激,但我认为这对您有用.首先,使用 combn 创建所有列的组合.然后编写一个函数与 apply 函数一起使用以遍历组合.我喜欢使用 plyr 因为它很容易在后端指定你想要的数据结构.另请注意,您只需为每个列组合计算一次卡方检验,这也会大大加快速度.

A sample of your data would be appreciated, but I think this will work for you. First, create a combination of all columns with combn. Then write a function to use with an apply function to iterate through the combos. I like to use plyr since it is easy to specify what you want for a data structure on the back end. Also note you only need to compute the chi square test once for each combination of columns, which should speed things up quite a bit as well.

library(plyr)

combos <- combn(ncol(Dat),2)

adply(combos, 2, function(x) {
  test <- chisq.test(Dat[, x[1]], Dat[, x[2]])

  out <- data.frame("Row" = colnames(Dat)[x[1]]
                    , "Column" = colnames(Dat[x[2]])
                    , "Chi.Square" = round(test$statistic,3)
                    ,  "df"= test$parameter
                    ,  "p.value" = round(test$p.value, 3)
                    )
  return(out)

})  

这篇关于R中使用for循环的卡方分析的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆