计算 R 中的方差和置信区间之内和之间 [英] Calculate within and between variances and confidence intervals in R

查看:66
本文介绍了计算 R 中的方差和置信区间之内和之间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

作为开发新的分析化学方法的一部分,我需要根据一些数据计算运行内和运行之间的差异.我还需要使用 R 语言从这些数据中获得置信区间

我想我需要使用方差分析之类的吗?

我的数据就像

<代码>>方差运行代表值1 1 1 9.852 1 2 9.953 1 3 10.004 2 1 9.905 2 2 8.806 2 3 9.507 3 1 11.208 3 2 11.109 3 3 9.8010 4 1 9.7011 4 2 10.1012 4 3 10.00

解决方案

我一直在研究一个类似的问题.我找到了 Burdick 和 Graybill 对计算置信区间的参考(Burdick, R. 和 Graybill, F. 1992,方差分量的置信区间,CRC Press)

使用一些我一直在尝试的代码获得这些值

<预><代码>> kiaraov = aov(Value~Run+Error(Run),data=kiar)> 总结(kiaraov)错误:运行Df Sum Sq 均值 Sq运行 3 2.57583 0.85861错误:在Df Sum Sq Mean Sq F 值 Pr(>F)残差 8 1.93833 0.24229> 约束 = 95> a = (1-(confint/100))/2> Grandmean = as.vector(kiaraov$"(Intercept)"[[1]][1]) # Grand Mean(我认为)> inside = summary(kiaraov)$"Error: Within"[[1]]$"Mean Sq" # S2^2Mean Square Value for Within Run> dfRun = summary(kiaraov)$"错误:运行"[[1]]$"df"> dfWithin = summary(kiaraov)$"错误:在"[[1]]$"Df"内> Run = summary(kiaraov)$"Error: Run"[[1]]$"Mean Sq" # S1^2Mean Square for between Run> between = (Run-within)/((dfWithin/(dfRun+1))+1) # (S1^2-S2^2)/J> 总计 = 介于 + 介于> between # 运行差异之间[1] 0.2054398> 内 # 内运行差异[1] 0.2422917> total # 总方差[1] 0.4477315> betweenCV = sqrt(between)/grandmean * 100 # 运行之间的 CV%> insideCV = sqrt(within)/grandmean * 100 # 运行内 CV%> totalCV = sqrt(total)/grandmean * 100 # 总 CV%> #within 置信区间> insideLCB = inside/qf(1-a,8,Inf) # LCB 内> insideUCB = inside/qf(a,8,Inf) # 在 UCB 内> #置信区间之间> n1 = dfRun> n2 = dfWithin> G1 = 1-(1/qf(1-a,n1,Inf)) # 根据 Burdick 和 Graybill 这应该是一个> G2 = 1-(1/qf(1-a,n2,Inf))> H1 = (1/qf(a,n1,Inf))-1 # 这应该是 1-a,但我的结果不同意> H2 = (1/qf(a,n2,Inf))-1> G12 = ((qf(1-a,n1,n2)-1)^2-(G1^2*qf(1-a,n1,n2)^2)-(H2^2))/qf(1-a,n1,n2) # 再次,应该是 a,而不是 1-a> H12 = ((1-qf(a,n1,n2))^2-H1^2*qf(a,n1,n2)^2-G2^2)/qf(a,n1,n2) # 再次,应该是1-a,而不是a> Vu = H1^2*Run^2+G2^2*within^2+H12*Run*within> Vl = G1^2*Run^2+H2^2*within^2+G12*within*Run> betweenLCB = (Run-within-sqrt(Vl))/J #Betwen LCB> betweenUCB = (Run-within+sqrt(Vu))/J # UCB之间> #总置信区间> y = (Run+(J-1)*within)/J> totalLCB = y-(sqrt(G1^2*Run^2+G2^2*(J-1)^2*within^2)/J) #总LCB> totalUCB = y+(sqrt(H1^2*Run^2+H2^2*(J-1)^2*within^2)/J) #总UCB> result = data.frame(Name=c("within", "between", "total"),CV=c(withinCV,betweenCV,totalCV),LCB=c(sqrt(withinLCB)/grandmean*100,sqrt(betweenLCB)/grandmean*100,sqrt(totalLCB)/grandmean*100),UCB=c(sqrt(withinUCB)/grandmean*100,sqrt(betweenUCB)/grandmean*100,sqrt(totalUCB)/grandmean*100))> 结果姓名 CV LCB UCB1 以内 4.926418 3.327584 9.437892 之间 4.536327 NaN 19.735683 共 6.696855 4.846030 20.42647

此处运行 CV 之间的较低置信区间小于零,因此报告为 NaN.

我很想有一个更好的方法来做到这一点.如果我有时间,我可能会尝试创建一个函数来执行此操作.

保罗.

--

我最终确实写了一个函数,这里是(警告空手)

#' avar 函数#'#' 通过方差分析计算数据集的内、间和总 %CV,以及#' 相关置信区间#'#' @param dataf - 要使用的数据框,长格式#' @param afactor 表示dataf中包含因子的列的字符串#' @param aresponse 字符字符串,表示 dataf 中包含响应值的列#' @param aconfidence 使用什么置信度限制,默认值 = 95%#' @paramdigits 要报告的有效数字,默认值 = 3#' @param debug 布尔值,是否应显示调试消息,默认值=FALSE#' @returnType 数据框包含每个的平均值、范围内、之间和总计 %CV 以及 LCB 和 UCB#' @返回#'@作者保罗赫利#' @出口#' @例子#' #使用来自 Burdick 和 Graybill 的 BGBottles 数据 第 62 页#'assayvar(dataf=BGBottles, afactor="Machine", aresponse="weight")avar<-function(dataf,afactor,aresponse,aconfidence=95,digits=3,debug=FALSE){dataf<-subset(dataf,!is.na(with(dataf,get(aresponse))))nmissing<-function(x) sum(!is.na(x))n<-nrow(subset(dataf,is.numeric(with(dataf,get(aresponse)))))datadesc<-ddply(dataf, afactor, colwise(nmissing,aresponse))I<-nrow(datadesc)如果(调试){打印(数据描述)}if(min(datadesc[,2])==max(datadesc[,2])){平衡<-TRUEJ<-min(datadesc[,2])if(debug){message(paste("Dataset 是平衡的,J=",J,"I is ",I,sep=""))}} 别的 {平衡<-FALSEJh<-I/(sum(1/datadesc[,2], na.rm = TRUE))J<-Jhm<-min(datadesc[,2])M<-max(datadesc[,2])if(debug){message(paste("数据集不平衡,像我一样,我是",I,sep=""))}if(debug){message(paste("Jh is ",Jh, ", m is ",m, ", M is ",M, sep=""))}}if(debug){message(paste("Call afactor=",afactor,", aresponse=",aresponse,sep=""))}公式文本<-paste(as.character(aresponse)," ~ 1 + Error(",as.character(afactor),")",sep="")if(debug){message(paste("formula text is ",formulatext,sep=""))}aovformula<-formula(formulatext)if(debug){message(paste("Formula is ",as.character(aovformula),sep=""))}分析aov<-aov(公式=aovformula,data=dataf)如果(调试){打印(阿萨亚夫)打印(摘要(assayaov))}a<-1-((1-(aconfidence/100))/2)if(debug){message(paste("confidence is ",aconfidence,", alpha is ",a,sep=""))}Grandmean<-as.vector(assayaov$"(Intercept)"[[1]][1]) # Grand Mean(我认为)if(debug){message(paste("n is",n,sep=""))}#这一行注释掉了,似乎被一个从外部公式构建的 aov 对象窒息了#grandmean<-as.vector(model.tables(assayaov,type="means")[[1]]$`Grand mean`) # Grand Mean(我认为)内<-summary(assayaov)[[2]][[1]]$"Mean Sq" # d2e, S2^2 机器内的均方值 = 0.1819dfRun<-summary(assayaov)[[1]][[1]]$"Df" # DF for inside = 3dfWithin<-summary(assayaov)[[2]][[1]]$"Df" # DF for inside = 8Run<-summary(assayaov)[[1]][[1]]$"Mean Sq" # S1^2Mean Square for Machine如果(调试){消息(粘贴(运行的均方?",运行,sep ="))}#Was between<-(Run-within)/((dfWithin/(dfRun+1))+1) 但我的评论表明这应该只是 J,所以我将使用 J !之间<-(内运行)/J#d2a(S1^2-S2^2)/Jif(debug){message(paste("S1^2均方机为",Run,", S2^2均方内为",within))}总计<-之间+之内之间 # 运行差异之间在#内运行差异内total # 总方差if(debug){message(paste("between is ",between,", inside is ",within,", Total is ",total,sep=""))}betweenCV<-sqrt(between)/grandmean * 100 # 运行之间的 CV%insideCV<-sqrt(within)/grandmean * 100 # 运行内 CV%totalCV<-sqrt(total)/grandmean * 100 # Total CV%n1<-dfRunn2<-df内if(debug){message(paste("n1 是 ",n1,", n2 是 ",n2,sep=""))}#在置信区间内如果(余额){insideLCB<-within/qf(a,n2,Inf) # LCB内在UCB<-within/qf(1-a,n2,Inf) # 在UCB内} 别的 {insideLCB<-within/qf(a,n2,Inf) # LCB内在UCB<-within/qf(1-a,n2,Inf) # 在UCB内}#平均置信区间if(debug){message(paste(grandmean,"+/-(sqrt(",Run,"/",n,")*qt(",a,",df=",I-1,"))",sep=""))}meanLCB<-grandmean+(sqrt(Run/n)*qt(1-a,df=I-1)) # 错误meanUCB<-grandmean-(sqrt(Run/n)*qt(1-a,df=I-1)) # 错误if(debug){message(paste("Grandmean is ",grandmean,", meanLCB = ",meanLCB,", meanUCB = ",meanUCB,aresponse,sep=""))}如果(调试){打印(摘要(assayaov))}#置信区间之间G1<-1-(1/qf(a,n1,Inf))G2<-1-(1/qf(a,n2,Inf))H1<-(1/qf(1-a,n1,Inf))-1H2<-(1/qf(1-a,n2,Inf))-1G12<-((qf(a,n1,n2)-1)^2-(G1^2*qf(a,n1,n2)^2)-(H2^2))/qf(a,n1,n2)H12<-((1-qf(1-a,n1,n2))^2-H1^2*qf(1-a,n1,n2)^2-G2^2)/qf(1-a,n1,n2)if(debug){message(paste("G1 是",G1,", G2 是",G2,sep=""))消息(粘贴(H1是,H1,",H2是,H2,sep ="))消息(粘贴(G12是,G12,",H12是,H12,sep ="))}如果(余额){Vu<-H1^2*Run^2+G2^2*within^2+H12*Run*withinVl<-G1^2*Run^2+H2^2*within^2+G12*within*RunbetweenLCB<-(Run-within-sqrt(Vl))/J#Betwen LCBbetweenUCB<-(Run-within+sqrt(Vu))/J # UCB之间} 别的 {#Burdick 和 Graybill 似乎建议计算平均值的方差分析以找到 n1S12u/Jhmeandataf<-ddply(.data=dataf,.variable=afactor,.fun=function(df){mean(with(df, get(aresponse)), na.rm=TRUE)})meandataaov<-aov(formula(paste("V1~",afactor,sep="")), data=meandataf)sumsquare<-summary(meandataaov)[[1]]$`Sum Sq`#所以也许 S12u 只是那一点?Runu<-(sumsquare*Jh)/n1if(debug){message(paste("n1S12u/Jh 是 ",sumsquare,", 所以 S12u 是 ",Runu,sep=""))}Vu<-H1^2*Runu^2+G2^2*within^2+H12*Runu*withinVl<-G1^2*Runu^2+H2^2*within^2+G12*within*RunubetweenLCB<-(Runu-within-sqrt(Vl))/Jh #Betwen LCBbetweenUCB<-(Runu-within+sqrt(Vu))/Jh # UCB之间如果(调试){消息(粘贴(LCB之间是,LCB之间,",UCB之间是,UCB之间,sep ="))}}#总置信区间如果(余额){y<-(Run+(J-1)*within)/Jif(debug){message(paste("y is ",y,sep=""))}totalLCB<-y-(sqrt(G1^2*Run^2+G2^2*(J-1)^2*within^2)/J) #总LCBtotalUCB<-y+(sqrt(H1^2*Run^2+H2^2*(J-1)^2*within^2)/J) #总UCB} 别的 {y<-(Runu+(Jh-1)*within)/Jhif(debug){message(paste("y is ",y,sep=""))}totalLCB<-y-(sqrt(G1^2*Runu^2+G2^2*(Jh-1)^2*within^2)/Jh​​) #总LCBtotalUCB<-y+(sqrt(H1^2*Runu^2+H2^2*(Jh-1)^2*within^2)/Jh​​) #总UCB}if(debug){message(paste("totalLCB 是 ",totalLCB,", 总 UCB 是 ",totalUCB,sep=""))}# result<-data.frame(Name=c("within", "between", "total"),CV=c(withinCV,betweenCV,totalCV),# LCB=c(sqrt(withinLCB)/grandmean*100,sqrt(betweenLCB)/grandmean*100,sqrt(totalLCB)/grandmean*100),# UCB=c(sqrt(withinUCB)/grandmean*100,sqrt(betweenUCB)/grandmean*100,sqrt(totalUCB)/grandmean*100))结果<-data.frame(Mean=grandmean,MeanLCB=meanLCB,MeanUCB=meanUCB, Within=withinCV,WithinLCB=sqrt(withinLCB)/grandmean*100, WithinUCB=sqrt(withinUCB)/grandmean*100,之间=CV之间,LCB之间=sqrt(LCB之间)/grandmean*100,UCB之间=sqrt(UCB之间)/grandmean*100,Total=totalCV,TotalLCB=sqrt(totalLCB)/grandmean*100,TotalUCB=sqrt(totalUCB)/grandmean*100)if(!digits=="NA"){结果$Mean<-signif(result$Mean,digits=digits)result$MeanLCB<-signif(result$MeanLCB,digits=digits)result$MeanUCB<-signif(result$MeanUCB,digits=digits)结果$Within<-signif(result$Within,digits=digits)结果$WithinLCB<-signif(result$WithinLCB,digits=digits)结果$WithinUCB<-signif(result$WithinUCB,digits=digits)result$Between<-signif(result$Between,digits=digits)结果$BetweenLCB<-signif(result$BetweenLCB,digits=digits)结果$BetweenUCB<-signif(result$BetweenUCB,digits=digits)result$Total<-signif(result$Total,digits=digits)result$TotalLCB<-signif(result$TotalLCB,digits=digits)result$TotalUCB<-signif(result$TotalUCB,digits=digits)}返回(结果)}分析变量<-函数(数据,响应,因子,异常,置信度 = 95,数字 = 3,调试 = FALSE){结果<-ddply(adata,anominal,function(df){结果<-avar(dataf=df,afactor=afactor,aresponse=aresponse,aconfidence=aconfidence,digits=digits,debug=debug)结果 $n<-nrow(subset(df, !is.na(with(df, get(aresponse)))))返回(结果)})返回(结果)}

I need to calculate the within and between run variances from some data as part of developing a new analytical chemistry method. I also need confidence intervals from this data using the R language

I assume I need to use anova or something ?

My data is like

> variance
   Run Rep Value
1    1   1  9.85
2    1   2  9.95
3    1   3 10.00
4    2   1  9.90
5    2   2  8.80
6    2   3  9.50
7    3   1 11.20
8    3   2 11.10
9    3   3  9.80
10   4   1  9.70
11   4   2 10.10
12   4   3 10.00

解决方案

I've been looking at a similar problem. I've found reference to caluclating confidence intervals by Burdick and Graybill (Burdick, R. and Graybill, F. 1992, Confidence Intervals on variance components, CRC Press)

Using some code I've been trying I get these values



> kiaraov = aov(Value~Run+Error(Run),data=kiar)

> summary(kiaraov)

Error: Run
    Df  Sum Sq Mean Sq
Run  3 2.57583 0.85861

Error: Within
          Df  Sum Sq Mean Sq F value Pr(>F)
Residuals  8 1.93833 0.24229               
> confint = 95

> a = (1-(confint/100))/2

> grandmean = as.vector(kiaraov$"(Intercept)"[[1]][1]) # Grand Mean (I think)

> within = summary(kiaraov)$"Error: Within"[[1]]$"Mean Sq"  # S2^2Mean Square Value for Within Run

> dfRun = summary(kiaraov)$"Error: Run"[[1]]$"Df"

> dfWithin = summary(kiaraov)$"Error: Within"[[1]]$"Df"

> Run = summary(kiaraov)$"Error: Run"[[1]]$"Mean Sq" # S1^2Mean Square for between Run

> between = (Run-within)/((dfWithin/(dfRun+1))+1) # (S1^2-S2^2)/J

> total = between+within

> between # Between Run Variance
[1] 0.2054398

> within # Within Run Variance
[1] 0.2422917

> total # Total Variance
[1] 0.4477315

> betweenCV = sqrt(between)/grandmean * 100 # Between Run CV%

> withinCV = sqrt(within)/grandmean * 100 # Within Run CV%

> totalCV = sqrt(total)/grandmean * 100 # Total CV%

> #within confidence intervals

> withinLCB = within/qf(1-a,8,Inf) # Within LCB

> withinUCB = within/qf(a,8,Inf) # Within UCB

> #Between Confidence Intervals

> n1 = dfRun

> n2 = dfWithin

> G1 = 1-(1/qf(1-a,n1,Inf)) # According to Burdick and Graybill this should be a

> G2 = 1-(1/qf(1-a,n2,Inf))

> H1 = (1/qf(a,n1,Inf))-1  # and this should be 1-a, but my results don't agree

> H2 = (1/qf(a,n2,Inf))-1

> G12 = ((qf(1-a,n1,n2)-1)^2-(G1^2*qf(1-a,n1,n2)^2)-(H2^2))/qf(1-a,n1,n2) # again, should be a, not 1-a

> H12 = ((1-qf(a,n1,n2))^2-H1^2*qf(a,n1,n2)^2-G2^2)/qf(a,n1,n2) # again, should be 1-a, not a

> Vu = H1^2*Run^2+G2^2*within^2+H12*Run*within

> Vl = G1^2*Run^2+H2^2*within^2+G12*within*Run

> betweenLCB = (Run-within-sqrt(Vl))/J # Betwen LCB

> betweenUCB = (Run-within+sqrt(Vu))/J # Between UCB

> #Total Confidence Intervals

> y = (Run+(J-1)*within)/J

> totalLCB = y-(sqrt(G1^2*Run^2+G2^2*(J-1)^2*within^2)/J) # Total LCB

> totalUCB = y+(sqrt(H1^2*Run^2+H2^2*(J-1)^2*within^2)/J) # Total UCB

> result = data.frame(Name=c("within", "between", "total"),CV=c(withinCV,betweenCV,totalCV),LCB=c(sqrt(withinLCB)/grandmean*100,sqrt(betweenLCB)/grandmean*100,sqrt(totalLCB)/grandmean*100),UCB=c(sqrt(withinUCB)/grandmean*100,sqrt(betweenUCB)/grandmean*100,sqrt(totalUCB)/grandmean*100))

> result
     Name       CV      LCB      UCB
1  within 4.926418 3.327584  9.43789
2 between 4.536327      NaN 19.73568
3   total 6.696855 4.846030 20.42647

Here the lower confidence interval for between run CV is less than zero, so reported as NaN.

I'd love to have a better way to do this. If I get time I might try to create a function to do this.

Paul.

--

Edit: I did eventually write a function, here it is (caveat emptor)

#' avar Function
#' 
#' Calculate thewithin, between and total %CV of a dataset by ANOVA, and the
#' associated confidence intervals
#' 
#' @param dataf - The data frame to use, in long format 
#' @param afactor Character string representing the column in dataf that contains the factor
#' @param aresponse  Charactyer string representing the column in dataf that contains the response value
#' @param aconfidence What Confidence limits to use, default = 95%
#' @param digits  Significant Digits to report to, default = 3
#' @param debug Boolean, Should debug messages be displayed, default=FALSE
#' @returnType dataframe containing the Mean, Within, Between and Total %CV and LCB and UCB for each
#' @return 
#' @author Paul Hurley
#' @export
#' @examples 
#' #Using the BGBottles data from Burdick and Graybill Page 62
#' assayvar(dataf=BGBottles, afactor="Machine", aresponse="weight")
avar<-function(dataf, afactor, aresponse, aconfidence=95, digits=3, debug=FALSE){
    dataf<-subset(dataf,!is.na(with(dataf,get(aresponse))))
    nmissing<-function(x) sum(!is.na(x))
    n<-nrow(subset(dataf,is.numeric(with(dataf,get(aresponse)))))
    datadesc<-ddply(dataf, afactor, colwise(nmissing,aresponse))
    I<-nrow(datadesc)
    if(debug){print(datadesc)}
    if(min(datadesc[,2])==max(datadesc[,2])){
        balance<-TRUE
        J<-min(datadesc[,2])
        if(debug){message(paste("Dataset is balanced, J=",J,"I is ",I,sep=""))}
    } else {
        balance<-FALSE
        Jh<-I/(sum(1/datadesc[,2], na.rm = TRUE))
        J<-Jh
        m<-min(datadesc[,2])
        M<-max(datadesc[,2])
        if(debug){message(paste("Dataset is unbalanced, like me, I is ",I,sep=""))}
        if(debug){message(paste("Jh is ",Jh, ", m is ",m, ", M is ",M, sep=""))}
    }
    if(debug){message(paste("Call afactor=",afactor,", aresponse=",aresponse,sep=""))}
    formulatext<-paste(as.character(aresponse)," ~ 1 + Error(",as.character(afactor),")",sep="")
    if(debug){message(paste("formula text is ",formulatext,sep=""))}
    aovformula<-formula(formulatext)
    if(debug){message(paste("Formula is ",as.character(aovformula),sep=""))}
    assayaov<-aov(formula=aovformula,data=dataf)
    if(debug){
        print(assayaov)
        print(summary(assayaov))
    }
    a<-1-((1-(aconfidence/100))/2)
    if(debug){message(paste("confidence is ",aconfidence,", alpha is ",a,sep=""))}
    grandmean<-as.vector(assayaov$"(Intercept)"[[1]][1]) # Grand Mean (I think)
    if(debug){message(paste("n is",n,sep=""))}

    #This line commented out, seems to choke with an aov object built from an external formula
    #grandmean<-as.vector(model.tables(assayaov,type="means")[[1]]$`Grand mean`) # Grand Mean (I think)
    within<-summary(assayaov)[[2]][[1]]$"Mean Sq"  # d2e, S2^2 Mean Square Value for Within Machine = 0.1819
    dfRun<-summary(assayaov)[[1]][[1]]$"Df"  # DF for within = 3
    dfWithin<-summary(assayaov)[[2]][[1]]$"Df"  # DF for within = 8
    Run<-summary(assayaov)[[1]][[1]]$"Mean Sq" # S1^2Mean Square for Machine
    if(debug){message(paste("mean square for Run ?",Run,sep=""))}
    #Was between<-(Run-within)/((dfWithin/(dfRun+1))+1) but my comment suggests this should be just J, so I'll use J !
    between<-(Run-within)/J # d2a (S1^2-S2^2)/J
    if(debug){message(paste("S1^2 mean square machine is ",Run,", S2^2 mean square within is ",within))}
    total<-between+within
    between # Between Run Variance
    within # Within Run Variance
    total # Total Variance
    if(debug){message(paste("between is ",between,", within is ",within,", Total is ",total,sep=""))}

    betweenCV<-sqrt(between)/grandmean * 100 # Between Run CV%
    withinCV<-sqrt(within)/grandmean * 100 # Within Run CV%
    totalCV<-sqrt(total)/grandmean * 100 # Total CV%
    n1<-dfRun
    n2<-dfWithin
    if(debug){message(paste("n1 is ",n1,", n2 is ",n2,sep=""))}
    #within confidence intervals
    if(balance){
        withinLCB<-within/qf(a,n2,Inf) # Within LCB
        withinUCB<-within/qf(1-a,n2,Inf) # Within UCB
    } else {
        withinLCB<-within/qf(a,n2,Inf) # Within LCB
        withinUCB<-within/qf(1-a,n2,Inf) # Within UCB
    }
#Mean Confidence Intervals
    if(debug){message(paste(grandmean,"+/-(sqrt(",Run,"/",n,")*qt(",a,",df=",I-1,"))",sep=""))} 
    meanLCB<-grandmean+(sqrt(Run/n)*qt(1-a,df=I-1)) # wrong
    meanUCB<-grandmean-(sqrt(Run/n)*qt(1-a,df=I-1)) # wrong
    if(debug){message(paste("Grandmean is ",grandmean,", meanLCB = ",meanLCB,", meanUCB = ",meanUCB,aresponse,sep=""))}
    if(debug){print(summary(assayaov))}
#Between Confidence Intervals
    G1<-1-(1/qf(a,n1,Inf)) 
    G2<-1-(1/qf(a,n2,Inf))
    H1<-(1/qf(1-a,n1,Inf))-1  
    H2<-(1/qf(1-a,n2,Inf))-1
    G12<-((qf(a,n1,n2)-1)^2-(G1^2*qf(a,n1,n2)^2)-(H2^2))/qf(a,n1,n2) 
    H12<-((1-qf(1-a,n1,n2))^2-H1^2*qf(1-a,n1,n2)^2-G2^2)/qf(1-a,n1,n2) 
    if(debug){message(paste("G1 is ",G1,", G2 is ",G2,sep=""))
        message(paste("H1 is ",H1,", H2 is ",H2,sep=""))
        message(paste("G12 is ",G12,", H12 is ",H12,sep=""))
    }
    if(balance){
        Vu<-H1^2*Run^2+G2^2*within^2+H12*Run*within
        Vl<-G1^2*Run^2+H2^2*within^2+G12*within*Run
        betweenLCB<-(Run-within-sqrt(Vl))/J # Betwen LCB
        betweenUCB<-(Run-within+sqrt(Vu))/J # Between UCB
    } else {
        #Burdick and Graybill seem to suggest calculating anova of mean values to find n1S12u/Jh
        meandataf<-ddply(.data=dataf,.variable=afactor, .fun=function(df){mean(with(df, get(aresponse)), na.rm=TRUE)})
        meandataaov<-aov(formula(paste("V1~",afactor,sep="")), data=meandataf)
        sumsquare<-summary(meandataaov)[[1]]$`Sum Sq`
        #so maybe S12u is just that bit ?
        Runu<-(sumsquare*Jh)/n1
        if(debug){message(paste("n1S12u/Jh is ",sumsquare,", so S12u is ",Runu,sep=""))}
        Vu<-H1^2*Runu^2+G2^2*within^2+H12*Runu*within
        Vl<-G1^2*Runu^2+H2^2*within^2+G12*within*Runu
        betweenLCB<-(Runu-within-sqrt(Vl))/Jh # Betwen LCB
        betweenUCB<-(Runu-within+sqrt(Vu))/Jh # Between UCB
        if(debug){message(paste("betweenLCB is ",betweenLCB,", between UCB is ",betweenUCB,sep=""))}
    }
#Total Confidence Intervals
    if(balance){
        y<-(Run+(J-1)*within)/J
        if(debug){message(paste("y is ",y,sep=""))}
        totalLCB<-y-(sqrt(G1^2*Run^2+G2^2*(J-1)^2*within^2)/J) # Total LCB
        totalUCB<-y+(sqrt(H1^2*Run^2+H2^2*(J-1)^2*within^2)/J) # Total UCB
    } else {
        y<-(Runu+(Jh-1)*within)/Jh
        if(debug){message(paste("y is ",y,sep=""))}
        totalLCB<-y-(sqrt(G1^2*Runu^2+G2^2*(Jh-1)^2*within^2)/Jh) # Total LCB
        totalUCB<-y+(sqrt(H1^2*Runu^2+H2^2*(Jh-1)^2*within^2)/Jh) # Total UCB
    }
    if(debug){message(paste("totalLCB is ",totalLCB,", total UCB is ",totalUCB,sep=""))}
#   result<-data.frame(Name=c("within", "between", "total"),CV=c(withinCV,betweenCV,totalCV),
#           LCB=c(sqrt(withinLCB)/grandmean*100,sqrt(betweenLCB)/grandmean*100,sqrt(totalLCB)/grandmean*100),
#           UCB=c(sqrt(withinUCB)/grandmean*100,sqrt(betweenUCB)/grandmean*100,sqrt(totalUCB)/grandmean*100))
    result<-data.frame(Mean=grandmean,MeanLCB=meanLCB, MeanUCB=meanUCB, Within=withinCV,WithinLCB=sqrt(withinLCB)/grandmean*100, WithinUCB=sqrt(withinUCB)/grandmean*100,
            Between=betweenCV, BetweenLCB=sqrt(betweenLCB)/grandmean*100, BetweenUCB=sqrt(betweenUCB)/grandmean*100,
            Total=totalCV, TotalLCB=sqrt(totalLCB)/grandmean*100, TotalUCB=sqrt(totalUCB)/grandmean*100)
    if(!digits=="NA"){
        result$Mean<-signif(result$Mean,digits=digits)
        result$MeanLCB<-signif(result$MeanLCB,digits=digits)
        result$MeanUCB<-signif(result$MeanUCB,digits=digits)
        result$Within<-signif(result$Within,digits=digits)
        result$WithinLCB<-signif(result$WithinLCB,digits=digits)
        result$WithinUCB<-signif(result$WithinUCB,digits=digits)
        result$Between<-signif(result$Between,digits=digits)
        result$BetweenLCB<-signif(result$BetweenLCB,digits=digits)
        result$BetweenUCB<-signif(result$BetweenUCB,digits=digits)
        result$Total<-signif(result$Total,digits=digits)
        result$TotalLCB<-signif(result$TotalLCB,digits=digits)
        result$TotalUCB<-signif(result$TotalUCB,digits=digits)
    }
    return(result)
}

assayvar<-function(adata, aresponse, afactor, anominal, aconfidence=95, digits=3, debug=FALSE){
    result<-ddply(adata,anominal,function(df){
                resul<-avar(dataf=df,afactor=afactor,aresponse=aresponse,aconfidence=aconfidence, digits=digits, debug=debug)
                resul$n<-nrow(subset(df, !is.na(with(df, get(aresponse)))))
                return(resul)
            })
    return(result)
}

这篇关于计算 R 中的方差和置信区间之内和之间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆