mgcv: Extract knot locations for a `tp` smooth from a GAM model


Problem description

I am trying to extract the knot locations from a GAM so that I can split my predictor variable into categories for another model. My data contain a binary response variable (Used) and a continuous predictor (Open).

data <- data.frame(Used = rep(c(1,0,0,0),1250),
                   Open = round(runif(5000,0,50), 0))

I fit a GAM:

library(mgcv)
mod <- gam(Used ~ s(Open), family = binomial, data = data)

I can get the predicted values, the model matrix, etc. by passing type = "response" or type = "lpmatrix" to predict.gam, but I am struggling with how to extract the knot locations at which the coefficients change. Any suggestions are really appreciated!

newdat <- data.frame(Open = 0:50)  # assumed prediction grid; not defined in the original post
out <- as.data.frame(predict(mod, newdata = newdat, type = "response"))

If possible, I would also be interested in doing something like:

http://www.fromthebottomoftheheap.net/2014/05/15/identification-periods-of-change-with-gams/

in which statistically significant increases/decreases of the spline are identified. However, I am not using a GAMM at this point, so I am having trouble identifying, in my GAM, the model components that he extracts from his GAMM. This second item is more out of curiosity than anything.

Recommended answer

Comments:

  1. You should have tagged your question with R and mgcv when asking;
  2. At first I wanted to flag your question as a duplicate of "mgcv: how to extract knots, basis, coefficients and predictions for P-splines in adaptive smooth?", raised yesterday; my answer there should be pretty useful. But then I realized that there is actually a difference, so I will give a brief explanation here.

Answer:

In your gam call:

mod <- gam(Used ~ s(Open), binomial, data = data)

you did not specify the bs argument in s(), so the default basis, bs = 'tp', is used.

'tp', short for thin-plate regression spline, is not a smooth class with conventional knots. A full thin-plate spline does have knots: it places one exactly at each unique data point. For example, if you have n unique Open values, it has n knots. In the univariate case, this is just a smoothing spline.

However, a thin-plate regression spline is a low-rank approximation to the full thin-plate spline, based on a truncated eigendecomposition. The idea is similar to principal component analysis (PCA). Instead of using the original n thin-plate spline basis functions, it uses the first k principal components. This reduces computational complexity from O(n^3) to O(nk^2) while ensuring an optimal rank-k approximation.
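To make the PCA analogy concrete, here is a minimal base-R sketch of a rank-k truncation. It assumes the univariate thin-plate radial basis |x_i - x_j|^3 (constant factor dropped) and ignores the null-space constraints that mgcv also handles, so it illustrates the idea rather than reproducing mgcv's actual construction. The names x, E, k, U_k and E_k are mine.

```r
# Illustrative only: rank-k truncation of a full radial basis matrix.
x <- seq(0, 50, by = 1)                 # 51 unique covariate values
E <- abs(outer(x, x, "-"))^3            # full n x n basis matrix (assumed form)

ed <- eigen(E, symmetric = TRUE)        # eigendecomposition of the full basis
k  <- 9                                 # target rank (hypothetical choice)

# Keep the k eigenpairs with the largest absolute eigenvalues.
idx <- order(abs(ed$values), decreasing = TRUE)[1:k]
U_k <- ed$vectors[, idx]
E_k <- U_k %*% diag(ed$values[idx]) %*% t(U_k)   # rank-k approximation of E

# Relative Frobenius error of the truncation.
rel_err <- norm(E - E_k, "F") / norm(E, "F")
rel_err
```

The columns of U_k play the role of the retained "principal components"; none of them is tied to a single location on the x axis, which is why the reduced basis has no knots in the usual sense.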

As a result, there are really no knots to extract from a fitted thin-plate regression spline.
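You can check this on the fitted object. The sketch below refits the model from the question (the seed is my addition, for reproducibility only) and inspects the smooth. It relies on my understanding of the internal fields of mgcv smooth objects: Xu holding the unique covariate values and xp being the field where knot-based smooths keep their knots.

```r
library(mgcv)

set.seed(1)  # arbitrary seed, not part of the original question
data <- data.frame(Used = rep(c(1, 0, 0, 0), 1250),
                   Open = round(runif(5000, 0, 50), 0))

mod <- gam(Used ~ s(Open), family = binomial, data = data)
sm  <- mod$smooth[[1]]     # the only smooth term in the model

class(sm)                  # smooth class produced by the default bs = "tp"
is.null(sm[["xp"]])        # TRUE: no knot vector is stored for this basis
NROW(sm$Xu)                # number of unique Open values behind the full spline
sm$bs.dim                  # basis dimension k actually used
```

The object keeps the unique data locations and the basis dimension, but nothing that could serve as break points for cutting Open into classes.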

Since you are working with a univariate spline, there is really no need to use 'tp'. Just use bs = 'cr', the cubic regression spline. This was the default in mgcv before 2003, when tp became available. cr has knots, and you can extract them as I showed in my answer to the question mentioned above. Don't be confused by the bs = 'ad' in that question: P-splines, B-splines and natural cubic splines are all knot-based splines.
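A minimal sketch of the bs = 'cr' route, using the simulated data from the question. The seed, the explicit k = 10 and the names mod_cr, knots_cr and OpenClass are my additions; reading the knots from the xp field of the smooth object reflects how I understand mgcv to store them for cr smooths.

```r
library(mgcv)

set.seed(1)  # arbitrary seed, not part of the original question
data <- data.frame(Used = rep(c(1, 0, 0, 0), 1250),
                   Open = round(runif(5000, 0, 50), 0))

# Cubic regression spline with k = 10 knots.
mod_cr <- gam(Used ~ s(Open, bs = "cr", k = 10),
              family = binomial, data = data)

# Knot locations of the first (and only) smooth term.
knots_cr <- mod_cr$smooth[[1]]$xp
knots_cr

# Use the knots as break points to turn Open into categories.
data$OpenClass <- cut(data$Open, breaks = knots_cr, include.lowest = TRUE)
table(data$OpenClass)
```

Because the outermost knots sit at the minimum and maximum of Open, include.lowest = TRUE is enough for every observation to fall into a class. To choose the break points yourself, you can pass them through the knots argument of gam instead of letting mgcv place them.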
