Why is it inadvisable to get statistical summary information for regression coefficients from a glmnet model?


Question

I have a regression model with a binary outcome. I fitted the model with glmnet and obtained the selected variables and their coefficients.

Since glmnet doesn't calculate variable importance, I would like to feed the exact output (selected variables and their coefficients) to glm to get that information (standard errors, etc.).

I searched the R documentation, and it seems I can use the "method" option in glm to specify a user-defined function. But I could not get this to work. Could someone help me with this?

Recommended answer

"It is a very natural question to ask for standard errors of regression coefficients or other estimated quantities. In principle such standard errors can easily be calculated, e.g. using the bootstrap.

Still, this package deliberately does not provide them. The reason for this is that standard errors are not very meaningful for strongly biased estimates such as arise from penalized estimation methods. Penalized estimation is a procedure that reduces the variance of estimators by introducing substantial bias. The bias of each estimator is therefore a major component of its mean squared error, whereas its variance may contribute only a small part.

Unfortunately, in most applications of penalized regression it is impossible to obtain a sufficiently precise estimate of the bias. Any bootstrap-based calculations can only give an assessment of the variance of the estimates. Reliable estimates of the bias are only available if reliable unbiased estimates are available, which is typically not the case in situations in which penalized estimates are used.

Reporting a standard error of a penalized estimate therefore tells only part of the story. It can give a mistaken impression of great precision, completely ignoring the inaccuracy caused by the bias. It is certainly a mistake to make confidence statements that are only based on an assessment of the variance of the estimates, such as bootstrap-based confidence intervals do."

Jelle Goeman, PhD, Leiden University, author of the Penalized package in R.
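
The argument in the quote can be illustrated with a small simulation. The sketch below is not glmnet and is not taken from the quoted package. It is a toy example in plain Python, assuming a deliberately simple "penalized" estimator (the sample mean multiplied by a shrinkage factor) and a known true mean, so that the bias can actually be measured. The names `shrunken_mean` and `simulate` and all parameter values are hypothetical, chosen only for illustration. It shows that the squared bias can dominate the mean squared error while a bootstrap standard error reflects only the much smaller variance.

```python
import random
import statistics


def shrunken_mean(sample, shrink):
    """Toy penalized estimator (an assumption for illustration, not glmnet):
    the sample mean pulled toward zero by a fixed factor."""
    return shrink * statistics.fmean(sample)


def simulate(true_mean=2.0, sd=1.0, n=50, shrink=0.5,
             reps=2000, boot=200, seed=1):
    rng = random.Random(seed)

    # Monte Carlo over fresh datasets. This is only possible here because
    # the simulation knows true_mean; a real analysis does not.
    estimates = []
    for _ in range(reps):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        estimates.append(shrunken_mean(sample, shrink))

    bias = statistics.fmean(estimates) - true_mean
    variance = statistics.pvariance(estimates)
    mse = statistics.fmean([(e - true_mean) ** 2 for e in estimates])

    # Bootstrap on a single observed dataset: what an analyst could do
    # in practice. Resampling sees the spread of the estimator, not its bias.
    observed = [rng.gauss(true_mean, sd) for _ in range(n)]
    boot_estimates = [
        shrunken_mean(rng.choices(observed, k=n), shrink)
        for _ in range(boot)
    ]
    boot_se = statistics.pstdev(boot_estimates)

    return {"bias": bias, "variance": variance, "mse": mse, "boot_se": boot_se}


if __name__ == "__main__":
    result = simulate()
    print("bias^2      :", result["bias"] ** 2)
    print("variance    :", result["variance"])
    print("mse         :", result["mse"])
    print("bootstrap SE:", result["boot_se"])
```

With these settings the estimator's expected value is half the true mean, so the squared bias is close to 1 while the variance is on the order of 0.005. The bootstrap standard error is correspondingly small, and an interval built from it alone would look precise while sitting far from the true value.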
