How to implement polynomial logistic regression in scikit-learn?

Question

I'm trying to build a non-linear logistic regression, i.e. a polynomial logistic regression, using scikit-learn, but I couldn't find how to set the degree of the polynomial. Has anybody tried this? Thanks a lot!

Recommended answer

You need to proceed in two steps: expand the inputs into polynomial features, then fit an ordinary logistic regression on the expanded features. Let us assume you are using the iris dataset (so you have a reproducible example):

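from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Load the iris data: 150 samples, 4 features, 3 classes
data = load_iris()
X = data.data
y = data.target
# Default split is 75% train / 25% test; random_state fixes the split
# so that repeated runs give the same result
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)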

Step 1

First you need to convert your data to polynomial features. Originally, our data has 4 columns:

X_train.shape  # (112, 4)

You can create the polynomial features with scikit-learn's PolynomialFeatures; the degree parameter sets the degree of the polynomial (here it is 2):

poly = PolynomialFeatures(degree=2, interaction_only=False, include_bias=False)
X_poly = poly.fit_transform(X_train)
X_poly.shape  # (112, 14)

We now have 14 features (the original 4, their 4 squares, and the 6 pairwise products).
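The count of 14 can be checked by hand. The following is a minimal sketch in plain Python, not scikit-learn's own implementation: it lists every monomial of total degree 1 to 2 over 4 input columns and compares the count with the closed form C(n + d, d) - 1. The helper name polynomial_terms is hypothetical and exists only for this illustration.

```python
from itertools import combinations_with_replacement
from math import comb


def polynomial_terms(n_features, degree):
    """List monomials of total degree 1..degree as tuples of column indices.

    (0, 0) stands for x0 squared and (0, 1) for x0 * x1.
    The constant (bias) term is left out, matching include_bias=False.
    """
    terms = []
    for d in range(1, degree + 1):
        terms.extend(combinations_with_replacement(range(n_features), d))
    return terms


terms = polynomial_terms(4, 2)
n_terms = len(terms)

# Closed form: C(n + d, d) monomials of degree <= d, minus 1 for the bias term
expected = comb(4 + 2, 2) - 1

print(n_terms, expected)
```

The same formula shows how quickly the feature count grows with the degree, which is worth keeping in mind before choosing a large value.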

Step 2

You can now fit your logistic regression on X_poly:

lr = LogisticRegression()
lr.fit(X_poly,y_train)

Note: if you then want to evaluate your model on the test data, you need to apply the same two steps, transforming the test set with the already fitted poly (transform, not fit_transform) before scoring:

lr.score(poly.transform(X_test), y_test)

Putting everything together in a Pipeline (optional)

You may want to use a Pipeline instead, which chains these two steps in one object and avoids building intermediate arrays such as X_poly:

pipe = Pipeline([('polynomial_features',poly), ('logistic_regression',lr)])
pipe.fit(X_train, y_train)
pipe.score(X_test, y_test)
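Once the steps live in a Pipeline, the degree itself can be tuned like any other hyperparameter. The sketch below is one possible setup and not part of the original answer: it cross-validates degrees 1 to 3 with GridSearchCV, addressing the step by its pipeline name followed by a double underscore. The added StandardScaler step, the candidate degrees, cv=5 and max_iter=1000 are all assumptions made for the illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = Pipeline([
    ("polynomial_features", PolynomialFeatures(include_bias=False)),
    # Scaling is an added assumption: it usually helps the solver converge
    ("scaler", StandardScaler()),
    ("logistic_regression", LogisticRegression(max_iter=1000)),
])

# "<step name>__<parameter>" targets a parameter of one pipeline step
param_grid = {"polynomial_features__degree": [1, 2, 3]}

search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

best_degree = search.best_params_["polynomial_features__degree"]
test_accuracy = search.score(X_test, y_test)  # accuracy of the refitted best model
print(best_degree, test_accuracy)
```

Because the transformer is refitted inside each cross-validation fold, the search only ever sees the training portion of a fold when building the polynomial features.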
