B-spline confusion


Question

I realise that there are posts on the topic of B-Splines on this board but those have actually made me more confused so I thought someone might be able to help me.

I have simulated data for x-values ranging from 0 to 1. I'd like to fit to my data a cubic spline (degree = 3) with knots at 0, 0.1, 0.2, ... , 0.9, 1. I'd also like to use the B-Spline basis and OLS for parameter estimation (I'm not looking for penalised splines).

I think I need the bs function from the splines package but I'm not quite sure and I also don't know what exactly to feed it.

I'd also like to plot the resulting polynomial spline.

Thanks!

Solution

## simulate some data - from mgcv::magic
set.seed(1)
n <- 400
x <- 0:(n-1)/(n-1)
f <- 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
y <- f + rnorm(n, 0, sd = 2)

## load the splines package - comes with R
library(splines)

As you want OLS estimates, use the bs() function in a formula passed to lm(). bs() generates the B-spline basis functions determined by the knots, the degree of the polynomial (degree = 3, i.e. cubic, by default), and so on.

mod <- lm(y ~ bs(x, knots = seq(0.1, 0.9, by = 0.1)))
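To see what bs() actually hands to lm(), it can help to look at the basis matrix directly. The following is only a sketch: it rebuilds the same basis outside the formula, and the object names B and mod2 are made up for illustration.

```r
library(splines)

## same simulated data as above
set.seed(1)
n <- 400
x <- 0:(n - 1) / (n - 1)
f <- 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 * (1 - x)^10
y <- f + rnorm(n, 0, sd = 2)

## build the cubic B-spline basis on its own (degree = 3 is the default)
B <- bs(x, knots = seq(0.1, 0.9, by = 0.1), degree = 3)

## one column per basis function: 9 interior knots + degree 3 = 12 columns
dim(B)

## passing the matrix to lm() should give the same OLS fit as bs() in the formula
mod  <- lm(y ~ bs(x, knots = seq(0.1, 0.9, by = 0.1)))
mod2 <- lm(y ~ B)
all.equal(unname(fitted(mod)), unname(fitted(mod2)))

## 12 basis coefficients plus the intercept
length(coef(mod))
```

The 12 columns correspond to the 12 degrees of freedom reported for the bs() term in the ANOVA table.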

You can treat that just like a linear model.

> anova(mod)
Analysis of Variance Table

Response: y
                                        Df Sum Sq Mean Sq F value    Pr(>F)    
bs(x, knots = seq(0.1, 0.9, by = 0.1))  12 2997.5 249.792  65.477 < 2.2e-16 ***
Residuals                              387 1476.4   3.815                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Some pointers on knot placement. bs has an argument Boundary.knots, with default Boundary.knots = range(x) - hence when I specified the knots argument above, I did not include the boundary knots.

Read ?bs for more information.
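As a quick check of the boundary-knot default, the sketch below inspects the attributes stored on the basis and compares it with a basis where the boundary knots are written out by hand. B and B2 are illustrative names only.

```r
library(splines)

n <- 400
x <- 0:(n - 1) / (n - 1)

## interior knots only; boundary knots fall back to range(x)
B <- bs(x, knots = seq(0.1, 0.9, by = 0.1))
attr(B, "knots")            # the 9 interior knots
attr(B, "Boundary.knots")   # range(x), which is 0 and 1 for these data

## spelling the boundary knots out explicitly gives the same basis
B2 <- bs(x, knots = seq(0.1, 0.9, by = 0.1), Boundary.knots = c(0, 1))
all.equal(as.vector(B), as.vector(B2))
```

So knots at 0, 0.1, ..., 0.9, 1 as asked for in the question are obtained by passing only 0.1, ..., 0.9 as knots, provided the data span 0 to 1.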

Producing a plot of the fitted spline

In the comments I discuss how to draw the fitted spline. One option is to order the data in terms of the covariate. This works fine for a single covariate, but need not work for 2 or more covariates. A further issue is that you can only evaluate the fitted spline at the observed values of x - this is fine if you have densely sampled the covariate, but if not, the spline may look odd, with long linear sections.
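For completeness, the ordering approach might look roughly like the sketch below. To make the ordering matter, it assumes the x values arrive shuffled (they are already sorted in the simulated data above); ord is just an illustrative name.

```r
library(splines)

set.seed(1)
n <- 400
## assumption for illustration: same grid of x values, but in random order
x <- sample(0:(n - 1) / (n - 1))
f <- 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 * (1 - x)^10
y <- f + rnorm(n, 0, sd = 2)

mod <- lm(y ~ bs(x, knots = seq(0.1, 0.9, by = 0.1)))

## sort by the covariate so lines() joins the fitted values left to right
ord <- order(x)
plot(y ~ x)
lines(x[ord], fitted(mod)[ord], lwd = 2, col = "red")
```

Without the reordering, lines() would connect the fitted values in data order and scribble back and forth across the plot.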

A more general solution is to use predict to generate predictions from the model for new values of the covariate or covariates. In the code below I show how to do this for the model above, predicting for 100 evenly-spaced values over the range of x.

pdat <- data.frame(x = seq(min(x), max(x), length.out = 100))
## predict for new `x`
pdat <- transform(pdat, yhat = predict(mod, newdata = pdat))

## now plot
ylim <- range(pdat$yhat, y) ## not needed here, but may be if plotting CIs too
plot(y ~ x, ylim = ylim)
lines(yhat ~ x, data = pdat, lwd = 2, col = "red")

That produces a scatterplot of the data with the fitted spline overlaid as a red line.
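If you also want the confidence intervals mentioned in the code comment, one possible sketch is below. It relies on predict() for lm objects accepting interval = "confidence"; the 95% level and the dashed line style are my own choices rather than anything the question asked for.

```r
library(splines)

## same simulated data and model as above
set.seed(1)
n <- 400
x <- 0:(n - 1) / (n - 1)
f <- 0.2 * x^11 * (10 * (1 - x))^6 + 10 * (10 * x)^3 * (1 - x)^10
y <- f + rnorm(n, 0, sd = 2)
mod <- lm(y ~ bs(x, knots = seq(0.1, 0.9, by = 0.1)))

pdat <- data.frame(x = seq(min(x), max(x), length.out = 100))

## with interval = "confidence", predict() returns columns fit, lwr and upr
ci <- predict(mod, newdata = pdat, interval = "confidence", level = 0.95)
pdat <- cbind(pdat, ci)

## make sure the y-axis covers both the data and the interval
ylim <- range(pdat$lwr, pdat$upr, y)
plot(y ~ x, ylim = ylim)
lines(fit ~ x, data = pdat, lwd = 2, col = "red")
lines(lwr ~ x, data = pdat, lty = "dashed", col = "red")
lines(upr ~ x, data = pdat, lty = "dashed", col = "red")
```

These are pointwise intervals for the fitted mean, not prediction intervals for new observations; use interval = "prediction" for the latter.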
