Scikit-learn Ridge Regression with unregularized intercept term

Question

Does scikit-learn's Ridge regression include the intercept coefficient in the regularization term, and if so, is there a way to run ridge regression without regularizing the intercept?

Suppose I fit a ridge regression:

from sklearn import linear_model

# X, y: some training data; X has no column of 1's
mymodel = linear_model.Ridge(alpha=0.1, fit_intercept=True).fit(X, y)
print(mymodel.coef_)
print(mymodel.intercept_)

for some data X, y, where X does not include a column of 1's. Setting fit_intercept=True automatically fits an intercept, and the corresponding coefficient is given by mymodel.intercept_. What I can't figure out is whether this intercept coefficient is part of the regularization sum in the optimization objective.

According to http://scikit-learn.org/stable/modules/linear_model.html, the optimization objective is to minimize, with respect to w:

||X*w - y||**2 + alpha * ||w||**2

(using the L2 norm). The second term is the regularization term. The question is whether it includes the intercept coefficient when fit_intercept=True, and if so, how to disable that.
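To make the question concrete, the two possibilities can be written out with the intercept as a separate scalar. The symbol b and the all-ones vector are introduced here only for illustration; they do not appear in the documentation quoted above.

```latex
% Candidate A: the intercept b is left out of the penalty
\min_{w,\,b}\; \lVert Xw + b\mathbf{1} - y \rVert_2^2 + \alpha \lVert w \rVert_2^2

% Candidate B: the intercept b is penalized together with w
\min_{w,\,b}\; \lVert Xw + b\mathbf{1} - y \rVert_2^2 + \alpha \left( \lVert w \rVert_2^2 + b^2 \right)
```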

Recommended answer

The intercept is not penalized. Just try a simple 3-point example with a large intercept.

from sklearn import linear_model
import numpy as np

# Three points on the line y = 1002 + 1*x
x = np.array([-1, 0, 1]).reshape((3, 1))
y = np.array([1001, 1002, 1003])
fit = linear_model.Ridge(alpha=0.1, fit_intercept=True).fit(x, y)

print(fit.intercept_)
print(fit.coef_)

The intercept is set to the unpenalized least-squares (MLE) intercept, 1002, while the slope is penalized: it comes out as 2/(2 + 0.1) ≈ 0.952 instead of 1.
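As a further check, the sketch below compares three fits on the same 3-point data: Ridge with fit_intercept=True, a hand-written closed-form solution on centered data (which leaves the intercept unpenalized by construction), and a contrasting fit where the intercept is deliberately penalized by adding a column of 1's and setting fit_intercept=False. The variable names (w_closed, b_closed, x_ones, fit_pen) are illustrative. The closed form is the textbook ridge solution and is not meant to mirror scikit-learn's internal solver.

```python
import numpy as np
from sklearn.linear_model import Ridge

# Same 3-point data as above: the points lie on y = 1002 + 1*x.
x = np.array([-1.0, 0.0, 1.0]).reshape(3, 1)
y = np.array([1001.0, 1002.0, 1003.0])
alpha = 0.1

# (a) Ridge with fit_intercept=True.
fit = Ridge(alpha=alpha, fit_intercept=True).fit(x, y)

# (b) Closed form with an unpenalized intercept:
#     center x and y, solve the ridge system, then recover the intercept.
xc = x - x.mean(axis=0)
yc = y - y.mean()
w_closed = np.linalg.solve(xc.T @ xc + alpha * np.eye(1), xc.T @ yc)
b_closed = y.mean() - x.mean(axis=0) @ w_closed

# (c) Contrast: model the intercept as a column of 1's so that it IS penalized.
x_ones = np.hstack([x, np.ones((3, 1))])
fit_pen = Ridge(alpha=alpha, fit_intercept=False).fit(x_ones, y)
slope_pen, intercept_pen = fit_pen.coef_

print("fit_intercept=True :", fit.coef_[0], fit.intercept_)
print("closed form        :", w_closed[0], b_closed)
print("penalized intercept:", slope_pen, intercept_pen)
```

Cases (a) and (b) should agree, with the intercept at 1002. In case (c) the intercept is shrunk to 3006/3.1 ≈ 969.68, which is what a penalized intercept would look like. So if the intercept must stay unpenalized, keep fit_intercept=True and do not add a column of 1's to X.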
