scikit learn coefficients PolynomialFeatures


Problem Description


I have fit a model with the help of PolynomialFeatures, but I don't know how to grab the coefficients of the model. The code is the following:

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
import matplotlib.pyplot as plt

X = np.matrix([0,1,2,3,4,5,6,7,8,9,10]).reshape((11,1))
Y = np.matrix([0,2.2,3.5,14.3,20.4,32.1,40.3,
               59.1,86.2,90.3,99.9]).reshape((11,1))
a = PolynomialFeatures(15)
modelo = make_pipeline(a, LinearRegression())
modelo.fit(X, Y)
plt.plot(X,Y,'.')
plt.plot(X, modelo.predict(X),'-')
plt.show()

Solution

Let's begin by using a second-degree polynomial, instead of the 15th-degree polynomial in your example, to simplify your problem (as well as to avoid overfitting).

Using your X, let's see how the values are transformed.

a = PolynomialFeatures(2)
a.fit_transform(X)

array([[   1.,    0.,    0.],
       [   1.,    1.,    1.],
       [   1.,    2.,    4.],
       [   1.,    3.,    9.],
       [   1.,    4.,   16.],
       [   1.,    5.,   25.],
       [   1.,    6.,   36.],
       [   1.,    7.,   49.],
       [   1.,    8.,   64.],
       [   1.,    9.,   81.],
       [   1.,   10.,  100.]])

We can see that the first feature is X^0, the second is X^1, and the third is X^2.
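If you want to confirm which column corresponds to which power, the fitted transformer can report the names of the features it generated. A minimal check, assuming a recent scikit-learn release (get_feature_names_out was added in 1.0; older versions expose get_feature_names instead):

# Ask the fitted PolynomialFeatures transformer which feature each column represents.
print(a.get_feature_names_out())
# ['1' 'x0' 'x0^2']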

Now, using your existing code, you are building a two-step pipeline named modelo.

We can access the second step's estimator (the LinearRegression) using modelo.steps[1][1]. From there we can use coef_ to obtain the coefficients, and intercept_ to obtain the intercept. (The values below assume modelo was rebuilt and refit with PolynomialFeatures(2) as above; the degree-15 pipeline would have 16 coefficients.)

modelo.steps[1][1].coef_
# [[ 0.          3.3486014   0.76468531]]

modelo.steps[1][1].intercept_
# [-2.75244755]
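If you prefer to look the step up by name rather than by position, make_pipeline names each step after its lowercased class name, so the same estimator is also reachable through named_steps (a small sketch, assuming the pipeline built above):

reg = modelo.named_steps['linearregression']   # same object as modelo.steps[1][1]
print(reg.coef_, reg.intercept_)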

From here we can see that the polynomial is y_estimated = -2.75 + 0 * X^0 + 3.35 * X^1 + 0.76 * X^2. (The X^0 column gets a zero coefficient because LinearRegression fits the intercept separately by default, so the constant term ends up in intercept_.)
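Putting it together, here is a minimal end-to-end sketch that refits the degree-2 pipeline and checks that the recovered polynomial reproduces modelo.predict. It uses np.array rather than np.matrix, since ** on a matrix means matrix power instead of an elementwise power:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

X = np.array([0,1,2,3,4,5,6,7,8,9,10]).reshape((11,1))
Y = np.array([0,2.2,3.5,14.3,20.4,32.1,40.3,
              59.1,86.2,90.3,99.9]).reshape((11,1))

modelo = make_pipeline(PolynomialFeatures(2), LinearRegression())
modelo.fit(X, Y)

reg = modelo.steps[1][1]            # the fitted LinearRegression
coefs = reg.coef_.ravel()           # approximately [0., 3.35, 0.76]
intercept = reg.intercept_[0]       # approximately -2.75

# Rebuild the polynomial by hand and compare with the pipeline's predictions.
y_manual = intercept + coefs[0] * X**0 + coefs[1] * X**1 + coefs[2] * X**2
print(np.allclose(y_manual, modelo.predict(X)))   # True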
