如何获得smooth.spline的置信区间? [英] How to get confidence interval for smooth.spline?

查看:274
本文介绍了如何获得smooth.spline的置信区间?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经使用smooth.spline来估计我的数据的三次样条.但是,当我使用公式计算90%的逐点置信区间时,结果似乎有些偏离.有人可以告诉我我做错了吗?我只是想知道是否有一个函数可以自动计算与smooth.spline函数相关的逐点间隔带.

I have used smooth.spline to estimate a cubic spline for my data. But when I calculate the 90% point-wise confidence interval using equation, the results seems to be a little bit off. Can someone please tell me if I did it wrongly? I am just wondering if there is a function that can automatically calculate a point-wise interval band associated with smooth.spline function.

boneMaleSmooth = smooth.spline( bone[males,"age"], bone[males,"spnbmd"], cv=FALSE)
error90_male = qnorm(.95)*sd(boneMaleSmooth$x)/sqrt(length(boneMaleSmooth$x))

plot(boneMaleSmooth, ylim=c(-0.5,0.5), col="blue", lwd=3, type="l", xlab="Age", 
     ylab="Relative Change in Spinal BMD")
points(bone[males,c(2,4)], col="blue", pch=20)
lines(boneMaleSmooth$x,boneMaleSmooth$y+error90_male, col="purple",lty=3,lwd=3)
lines(boneMaleSmooth$x,boneMaleSmooth$y-error90_male, col="purple",lty=3,lwd=3)

因为我不确定我是否正确执行了操作,所以我使用了mgcv包中的gam()函数.

Because I am not sure if I did it correctly, then I used gam() function from mgcv package.

它立即产生了置信度,但是我不确定它是90%还是95%CI或其他.如果有人可以解释,那就太好了.

It instantly gave a confidence band but I am not sure if it is 90% or 95% CI or something else. It would be great if someone can explain.

males=gam(bone[males,c(2,4)]$spnbmd ~s(bone[males,c(2,4)]$age), method = "GCV.Cp")
plot(males,xlab="Age",ylab="Relative Change in Spinal BMD")

推荐答案

我不确定smooth.spline的置信区间是否具有像lowess形式那样的不错"的置信区间.但是我从 CMU数据中找到了代码示例分析过程来确定贝叶斯自举置信区间.

I'm not sure the confidence intervals for smooth.spline have "nice" confidence intervals like those form lowess do. But I found a code sample from a CMU Data Analysis course to make Bayesian bootstap confidence intervals.

以下是使用的功能和示例.主要功能是spline.cis,其中第一个参数是数据帧,其中第一列是x值,第二列是y值.另一个重要参数是B,它指示要执行的引导程序复制次数. (有关完整的详细信息,请参见上面的链接PDF.)

Here are the functions used and an example. The main function is spline.cis where the first parameter is a data frame where the first column are the x values and the second column are the y values. The other important parameter is B which indicates the number bootstrap replications to do. (See the linked PDF above for the full details.)

# Helper functions
resampler <- function(data) {
    n <- nrow(data)
    resample.rows <- sample(1:n,size=n,replace=TRUE)
    return(data[resample.rows,])
}

spline.estimator <- function(data,m=300) {
    fit <- smooth.spline(x=data[,1],y=data[,2],cv=TRUE)
    eval.grid <- seq(from=min(data[,1]),to=max(data[,1]),length.out=m)
    return(predict(fit,x=eval.grid)$y) # We only want the predicted values
}

spline.cis <- function(data,B,alpha=0.05,m=300) {
    spline.main <- spline.estimator(data,m=m)
    spline.boots <- replicate(B,spline.estimator(resampler(data),m=m))
    cis.lower <- 2*spline.main - apply(spline.boots,1,quantile,probs=1-alpha/2)
    cis.upper <- 2*spline.main - apply(spline.boots,1,quantile,probs=alpha/2)
    return(list(main.curve=spline.main,lower.ci=cis.lower,upper.ci=cis.upper,
    x=seq(from=min(data[,1]),to=max(data[,1]),length.out=m)))
}

#sample data
data<-data.frame(x=rnorm(100), y=rnorm(100))

#run and plot
sp.cis <- spline.cis(data, B=1000,alpha=0.05)
plot(data[,1],data[,2])
lines(x=sp.cis$x,y=sp.cis$main.curve)
lines(x=sp.cis$x,y=sp.cis$lower.ci, lty=2)
lines(x=sp.cis$x,y=sp.cis$upper.ci, lty=2)

这给类似的东西

实际上,似乎可能存在更多采用参数的方法来使用折刀残差来计算置信区间.此代码来自 S +帮助页面,以使代码流畅.花键

Actually it looks like there might be a more parametric way to calculate confidence intervals using the jackknife residuals. This code comes from the S+ help page for smooth.spline

  fit <- smooth.spline(data$x, data$y)      # smooth.spline fit
  res <- (fit$yin - fit$y)/(1-fit$lev)      # jackknife residuals
sigma <- sqrt(var(res))                     # estimate sd

upper <- fit$y + 2.0*sigma*sqrt(fit$lev)   # upper 95% conf. band
lower <- fit$y - 2.0*sigma*sqrt(fit$lev)   # lower 95% conf. band
matplot(fit$x, cbind(upper, fit$y, lower), type="plp", pch=".")

结果

gam个置信区间而言,如果您阅读print.gam帮助文件,则会有一个se=参数(默认值为TRUE),文档会说

And as far as the gam confidence intervals go, if you read the print.gam help file, there is an se= parameter with default TRUE and the docs say

当TRUE(默认)上线和下线在绘制平滑度的估计值上下两个标准误差处添加到1-d图时,而对于2-d图来说,+ 1和-1标准误差处的曲面绘制轮廓并覆盖在轮廓图上以进行估计.如果提供正数,则在计算标准误差曲线或标准曲面时,将这个数字乘以标准误差.另请参见下面的阴影.

when TRUE (default) upper and lower lines are added to the 1-d plots at 2 standard errors above and below the estimate of the smooth being plotted while for 2-d plots, surfaces at +1 and -1 standard errors are contoured and overlayed on the contour plot for the estimate. If a positive number is supplied then this number is multiplied by the standard errors when calculating standard error curves or surfaces. See also shade, below.

因此,您可以通过调整此参数来调整置信区间. (这将在print()调用中.)

So you can adjust the confidence interval by adjusting this parameter. (This would be in the print() call.)

这篇关于如何获得smooth.spline的置信区间?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆