Why is it inadvisable to get statistical summary information for regression coefficients from a glmnet model?

Problem description

I have a regression model with a binary outcome. I fitted the model with glmnet and obtained the selected variables and their coefficients.

Since glmnet doesn't calculate variable importance, I would like to feed the exact output (selected variables and their coefficients) to glm to get that information (standard errors, etc.).

I searched the R documentation, and it seems I can use the "method" option in glm to specify a user-defined function. But I failed to do so. Could someone help me with this?

Recommended answer

"It is a very natural question to ask for standard errors of regression coefficients or other estimated quantities. In principle such standard errors can easily be calculated, e.g. using the bootstrap.

Still, this package deliberately does not provide them. The reason for this is that standard errors are not very meaningful for strongly biased estimates such as arise from penalized estimation methods. Penalized estimation is a procedure that reduces the variance of estimators by introducing substantial bias. The bias of each estimator is therefore a major component of its mean squared error, whereas its variance may contribute only a small part.

Unfortunately, in most applications of penalized regression it is impossible to obtain a sufficiently precise estimate of the bias. Any bootstrap-based calculations can only give an assessment of the variance of the estimates. Reliable estimates of the bias are only available if reliable unbiased estimates are available, which is typically not the case in situations in which penalized estimates are used.

Reporting a standard error of a penalized estimate therefore tells only part of the story. It can give a mistaken impression of great precision, completely ignoring the inaccuracy caused by the bias. It is certainly a mistake to make confidence statements that are only based on an assessment of the variance of the estimates, such as bootstrap-based confidence intervals do."

Dr. Jelle Goeman, Leiden University, author of the "penalized" package in R.
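
The argument in the quote rests on the usual decomposition of mean squared error into variance plus squared bias. As a rough illustration only, here is a small Python simulation (standard library only, not glmnet) of a ridge estimate for a single coefficient. The model, the true coefficient, the sample size and the penalty `LAM` are all assumptions chosen to make the effect visible; they do not come from the question. The sketch compares the bootstrap standard error, which can be computed from one data set, with the bias, which can only be measured here because the simulation knows the true coefficient.

```python
import random
import statistics


def ridge_slope(xs, ys, lam):
    """Ridge estimate for a one-predictor, no-intercept linear model."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def simulate(n, beta, sigma, rng):
    """Draw one data set from y = beta * x + Gaussian noise."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [beta * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def bootstrap_se(xs, ys, lam, n_boot, rng):
    """Standard deviation of the estimate over resampled (x, y) pairs."""
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            ridge_slope([xs[i] for i in idx], [ys[i] for i in idx], lam)
        )
    return statistics.stdev(estimates)


# Illustrative settings (assumptions, not taken from the question).
TRUE_BETA, N, SIGMA, LAM = 2.0, 100, 1.0, 100.0
rng = random.Random(0)

# What an analyst sees: one data set, one estimate, one bootstrap SE.
xs, ys = simulate(N, TRUE_BETA, SIGMA, rng)
estimate = ridge_slope(xs, ys, LAM)
se = bootstrap_se(xs, ys, LAM, 500, rng)

# What only a simulation can show: bias and variance over fresh data sets.
replicates = [
    ridge_slope(*simulate(N, TRUE_BETA, SIGMA, rng), LAM) for _ in range(2000)
]
bias = statistics.fmean(replicates) - TRUE_BETA
variance = statistics.pvariance(replicates)
mse = statistics.fmean([(b - TRUE_BETA) ** 2 for b in replicates])

print(f"estimate      = {estimate:.3f} (true value {TRUE_BETA})")
print(f"bootstrap SE  = {se:.3f}")
print(f"bias          = {bias:.3f}")
print(f"variance      = {variance:.4f}")
print(f"MSE           = {mse:.4f} (variance + bias^2 = {variance + bias**2:.4f})")
```

With a penalty this strong, the squared bias should account for almost all of the mean squared error, while the bootstrap standard error stays small. An interval of the form estimate ± 2 SE would then look precise and still miss the true coefficient, which is the false impression of precision the quote warns about.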
