How to interpret lm() coefficient estimates when using the bs() function for splines

Problem description

I have a set of points that run from (-5,5) through (0,0) to (5,5) in a symmetric V-shape. I'm fitting a model with lm() and the bs() function to get a "V-shaped" spline:

lm(formula = y ~ bs(x, degree = 1, knots = c(0)))

I get the "V-shape" when I predict outcomes with predict() and draw the prediction line. But when I look at the model estimates from coef(), I see estimates that I don't expect.

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)  
(Intercept)                       4.93821    0.16117  30.639 1.40e-09 ***
bs(x, degree = 1, knots = c(0))1 -5.12079    0.24026 -21.313 2.47e-08 ***
bs(x, degree = 1, knots = c(0))2 -0.05545    0.21701  -0.256    0.805 

I would expect a coefficient of -1 for the first part and +1 for the second part. Must I interpret the estimates in a different way?

If I code the knot manually in the lm() call, then I get these coefficients:

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.18258    0.13558  -1.347    0.215    
x           -1.02416    0.04805 -21.313 2.47e-08 ***
z            2.03723    0.08575  23.759 1.05e-08 ***

That's more like it: the coefficient of z (the knot term), taken relative to that of x, gives about +1.

I want to understand how to interpret the bs() result. I've checked, and the predicted values from the manual model and the bs() model are exactly the same.

Recommended answer

I would expect a -1 coefficient for the first part and a +1 coefficient for the second part.

I think your question is really about what a B-spline function is. To understand the meaning of the coefficients, you need to know what the basis functions of your spline are. See the following:

library(splines)
x <- seq(-5, 5, length.out = 100)  ## evaluation grid over the data range
b <- bs(x, degree = 1, knots = 0)  ## returns a basis matrix
str(b)  ## check structure
b1 <- b[, 1]  ## basis 1
b2 <- b[, 2]  ## basis 2
par(mfrow = c(1, 2))
plot(x, b1, type = "l", main = "basis 1: b1")
plot(x, b2, type = "l", main = "basis 2: b2")

Note:

  1. B-splines of degree 1 are tent functions, as you can see from b1;
  2. B-splines of degree 1 are scaled so that their function values lie between 0 and 1;
  3. the knots of a degree-1 B-spline are where it bends;
  4. B-splines of degree 1 have compact support, and are non-zero only over (no more than) three adjacent knots.
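
As a rough numeric check of these notes, the sketch below evaluates the two basis functions at the left boundary, at the knot, and at the right boundary. The names `pts`, `grid`, `basis_at_pts` and `basis_range` are illustrative, and `Boundary.knots` is set explicitly on the assumption that the data span [-5, 5].

```r
library(splines)

# Evaluate the basis at the left boundary, the knot, and the right boundary.
# Boundary.knots is fixed so the basis matches one built on x in [-5, 5].
pts <- c(-5, 0, 5)
b_pts <- bs(pts, degree = 1, knots = 0, Boundary.knots = c(-5, 5))
basis_at_pts <- cbind(x = pts, b1 = b_pts[, 1], b2 = b_pts[, 2])
print(basis_at_pts)

# On a fine grid, every basis value stays within [0, 1].
grid <- seq(-5, 5, length.out = 201)
b_grid <- bs(grid, degree = 1, knots = 0, Boundary.knots = c(-5, 5))
basis_range <- range(as.numeric(b_grid))
print(basis_range)
```

Here b1 is 0 at both boundaries and 1 at the knot (a full tent), while b2 is 0 up to the knot and rises to 1 at x = 5, so only half of its tent falls inside the data range.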

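You can get the (recursive) expression of B-splines from the definition of a B-spline. A B-spline of degree 0 is the most basic class, while

  • a B-spline of degree 1 is a linear combination of B-splines of degree 0
  • a B-spline of degree 2 is a linear combination of B-splines of degree 1
  • a B-spline of degree 3 is a linear combination of B-splines of degree 2

(Sorry, I was going off-topic...)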

Your linear regression using B-splines:

y ~ bs(x, degree = 1, knots = 0)

is just doing:

y ~ b1 + b2
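
A minimal sketch of this equivalence, using simulated data: the asker's actual points are not given, so the noisy V-shape, the seed and the noise level below are assumptions made only for illustration.

```r
library(splines)

set.seed(1)                                 # arbitrary seed, for reproducibility only
x <- seq(-5, 5, by = 1)
y <- abs(x) + rnorm(length(x), sd = 0.2)    # simulated noisy symmetric V (not the asker's data)

# Build the basis columns by hand
b  <- bs(x, degree = 1, knots = 0)
b1 <- b[, 1]
b2 <- b[, 2]

fit_bs     <- lm(y ~ bs(x, degree = 1, knots = 0))
fit_manual <- lm(y ~ b1 + b2)

coef_bs     <- unname(coef(fit_bs))
coef_manual <- unname(coef(fit_manual))
print(rbind(coef_bs, coef_manual))          # the two rows should agree
```

Both calls use the same design matrix, so the coefficients and fitted values coincide up to numerical precision; only the coefficient names differ.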

Now you should be able to understand what the coefficients mean: the fitted spline term (which is added to the intercept) is:

-5.12079 * b1 - 0.05545 * b2

These values come from the summary table:

Coefficients:
                                 Estimate Std. Error t value Pr(>|t|)  
(Intercept)                       4.93821    0.16117  30.639 1.40e-09 ***
bs(x, degree = 1, knots = c(0))1 -5.12079    0.24026 -21.313 2.47e-08 ***
bs(x, degree = 1, knots = c(0))2 -0.05545    0.21701  -0.256    0.805 

You might wonder why the coefficient of b2 is not significant. Compare your y with b1: your y is a symmetric V-shape, while b1 is an inverted symmetric V-shape. If you multiply b1 by -1 and rescale it by a factor of 5 (this explains the coefficient of about -5 for b1), then shift it up by the intercept of about 5, what do you get? A good match, right? So there is no need for b2.

However, if your y is asymmetric, running from (-5,5) through (0,0) and then to (5,10), you will notice that the coefficients for b1 and b2 are both significant. I think the other answer already gave you such an example.
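
The asymmetric case can be sketched with simulated data as follows. The data, seed and noise level are hypothetical and are not taken from the other answer.

```r
library(splines)

set.seed(42)                                  # arbitrary seed
x <- seq(-5, 5, by = 1)
y_true <- ifelse(x < 0, -x, 2 * x)            # (-5,5) -> (0,0) -> (5,10)
y <- y_true + rnorm(length(x), sd = 0.2)      # simulated noise; hypothetical data

fit_asym <- lm(y ~ bs(x, degree = 1, knots = 0))
coef_table <- coef(summary(fit_asym))         # estimates, std. errors, t and p values
print(coef_table)

estimates <- unname(coef_table[, "Estimate"])
p_values  <- unname(coef_table[, "Pr(>|t|)"])
```

With this shape the estimates should come out near 5, -5 and +5: the intercept is the value at x = -5, the b1 coefficient is the drop down to the knot, and the b2 coefficient is the rise from the left boundary value (5) to the right boundary value (10).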

Reparametrizing a fitted B-spline as piecewise polynomials is demonstrated in "Reparametrize fitted regression spline as piece-wise polynomials and export polynomial coefficients".
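
For this particular degree-1 fit the conversion can be done by hand. On [-5, 0] we have b1 = (x + 5)/5 and b2 = 0; on [0, 5] we have b1 = 1 - x/5 and b2 = x/5. The sketch below plugs in the coefficients reported in the question; the variable names are illustrative. The comparison with the manual model assumes that its z is the hinge term max(x, 0), which the question does not state.

```r
# Coefficients copied from the summary table in the question
beta0 <- 4.93821     # (Intercept)
beta1 <- -5.12079    # coefficient of b1
beta2 <- -0.05545    # coefficient of b2
h <- 5               # distance from each boundary knot (-5, 5) to the interior knot 0

slope_left    <- beta1 / h             # slope on [-5, 0]
slope_right   <- (beta2 - beta1) / h   # slope on [0, 5]
value_at_knot <- beta0 + beta1         # fitted value at x = 0

print(c(slope_left = slope_left,
        slope_right = slope_right,
        value_at_knot = value_at_knot))
```

The left slope agrees with the manual model's x coefficient (-1.02416), the right slope with the sum of its x and z coefficients (-1.02416 + 2.03723), and the value at the knot with its intercept (-0.18258). These are the roughly -1 and +1 slopes the asker expected.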
