How to calculate the regularization parameter in linear regression

Question

When we fit a high-degree polynomial to a set of points in a linear regression setup, we use regularization to prevent overfitting, which means including a lambda parameter in the cost function. This lambda is then used when updating the theta parameters in the gradient descent algorithm.
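For readers who want the setup in concrete form, here is a minimal sketch of a regularized cost function and the matching gradient descent update. The question does not say which penalty is used, so this assumes an L2 (ridge) penalty that leaves the intercept unpenalized. The function names, the toy data, and the values of lam and alpha are illustrative choices, not taken from the question.

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Squared-error cost plus an L2 penalty on every weight except the intercept."""
    m = len(y)
    residual = X @ theta - y
    penalty = lam * np.sum(theta[1:] ** 2)  # assumption: theta[0] is not penalized
    return (residual @ residual + penalty) / (2 * m)

def gradient_step(theta, X, y, lam, alpha):
    """One gradient descent update for the cost defined above."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m
    grad[1:] += (lam / m) * theta[1:]  # this is where lambda enters the theta update
    return theta - alpha * grad

# Toy data (hypothetical): noisy samples of a smooth curve, degree-5 polynomial features.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = np.sin(3 * x) + 0.1 * rng.normal(size=x.size)
X = np.vander(x, 6, increasing=True)  # columns: 1, x, x^2, ..., x^5

theta = np.zeros(X.shape[1])
initial_cost = regularized_cost(theta, X, y, lam=1.0)
for _ in range(2000):
    theta = gradient_step(theta, X, y, lam=1.0, alpha=0.1)
final_cost = regularized_cost(theta, X, y, lam=1.0)
print(f"cost before: {initial_cost:.4f}, cost after: {final_cost:.4f}")
```

Note that lam is passed in from outside: nothing in the cost or the update determines its value, which is what the question is really about.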

My question is: how do we calculate this lambda regularization parameter?

Recommended answer

The regularization parameter (lambda) is an input to your model, so what you probably want to know is how to select its value. Regularization reduces overfitting, which lowers the variance of your estimated regression parameters, but it does so at the expense of adding bias to the estimate. Increasing lambda gives less overfitting but also more bias. So the real question is: how much bias are you willing to tolerate in your estimate?
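The effect of increasing lambda can be seen in a small experiment. The sketch below assumes a ridge (L2) penalty solved in closed form with an unpenalized intercept. The helper ridge_fit, the toy data, and the lambda grid are all hypothetical. The training error is used only as a rough proxy for the bias that a larger lambda introduces.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution; the intercept column (index 0) is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

# Toy data (hypothetical): 20 noisy points fitted with a degree-9 polynomial.
rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 20)
y = np.sin(3 * x) + 0.2 * rng.normal(size=x.size)
X = np.vander(x, 10, increasing=True)

for lam in [0.0, 0.01, 1.0, 100.0]:
    theta = ridge_fit(X, y, lam)
    weight_norm = np.linalg.norm(theta[1:])
    train_mse = np.mean((X @ theta - y) ** 2)
    print(f"lambda={lam:g}  weight norm={weight_norm:.3f}  train MSE={train_mse:.4f}")
```

As lambda grows, the weights shrink toward zero and the fit to the training points gets worse. That is the trade described above: a less flexible model with more bias.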

One approach is to randomly subsample your data a number of times and look at the variation in your estimate. Then repeat the process with a slightly larger value of lambda to see how it affects the variability of the estimate. Keep in mind that whatever value of lambda you find appropriate for the subsampled data, you can likely use a smaller value to achieve comparable regularization on the full data set.
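The subsampling procedure could be sketched roughly as follows. This is one possible reading of the answer, not a definitive recipe. It assumes ridge regression in closed form, measures variability as the standard deviation of each coefficient across subsamples, and uses a hypothetical lambda grid, subsample fraction, and toy data set.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution; the intercept column (index 0) is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def coefficient_spread(X, y, lam, n_rounds=200, frac=0.5, seed=0):
    """Mean standard deviation of the fitted coefficients across random subsamples."""
    rng = np.random.default_rng(seed)  # same seed, so every lambda sees the same subsamples
    n = len(y)
    size = int(frac * n)
    thetas = []
    for _ in range(n_rounds):
        idx = rng.choice(n, size=size, replace=False)
        thetas.append(ridge_fit(X[idx], y[idx], lam))
    return np.std(np.array(thetas), axis=0).mean()

# Toy data (hypothetical): 100 noisy points, degree-7 polynomial features.
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=100)
y = np.sin(3 * x) + 0.2 * rng.normal(size=x.size)
X = np.vander(x, 8, increasing=True)

for lam in [0.001, 0.01, 0.1, 1.0, 10.0]:
    spread = coefficient_spread(X, y, lam)
    print(f"lambda={lam:g}  mean coefficient std={spread:.4f}")
```

The printed spread shows how much the estimate moves from one subsample to the next at each lambda. Choosing the point where the variability is acceptable is still a judgment call about how much bias to accept in exchange.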
