Logistic regression using SciPy


Problem description


I am trying to code up logistic regression in Python using the SciPy fmin_bfgs function, but am running into some issues. I wrote functions for the logistic (sigmoid) transformation function, and the cost function, and those work fine (I have used the optimized values of the parameter vector found via canned software to test the functions, and those match up). I am not that sure of my implementation of the gradient function, but it looks reasonable.

Here is the code:

# purpose: logistic regression 
import numpy as np
import scipy.optimize

# prepare the data
data = np.loadtxt('data.csv', delimiter=',', skiprows=1)
vY = data[:, 0]
mX = data[:, 1:]
intercept = np.ones((mX.shape[0], 1))
mX = np.concatenate((intercept, mX), axis=1)
iK = mX.shape[1]
iN = mX.shape[0]
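(Editor's note: data.csv is not attached to the question. Purely as a hypothetical stand-in for readers who want to run the snippet end to end, one could synthesize an input with the same layout, a 0/1 outcome in column 0 followed by 7 predictors so that iK == 8 matches the 8-element starting vectors used below, and then run the identical preparation steps:)

rng = np.random.default_rng(0)  # hypothetical synthetic substitute for data.csv
fake_data = np.column_stack(((rng.random(200) < 0.5).astype(float),  # column 0: 0/1 outcome
                             rng.normal(size=(200, 7))))             # columns 1..7: predictors
# then vY = fake_data[:, 0], mX = fake_data[:, 1:], etc., exactly as above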

# logistic transformation
def logit(mX, vBeta):
    return 1.0 / (1.0 + np.exp(-np.dot(mX, vBeta)))

# test function call
vBeta0 = np.array([-.10296645, -.0332327, -.01209484, .44626211, .92554137,
                   .53973828, 1.7993371, .7148045])
logit(mX, vBeta0)
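(Editor's note: one side point not raised in the question: for large negative values of np.dot(mX, vBeta), the hand-rolled sigmoid can trigger overflow warnings in np.exp. scipy.special.expit computes the same function in a numerically robust way; a sketch of a drop-in replacement, with a hypothetical name, might look like this:)

from scipy.special import expit

def logit_expit(mX, vBeta):
    # numerically stable sigmoid; equivalent to 1/(1 + exp(-X @ beta))
    return expit(np.dot(mX, vBeta))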

# cost function
def logLikelihoodLogit(vBeta, mX, vY):
    return -np.sum(vY * np.log(logit(mX, vBeta))
                   + (1 - vY) * np.log(1 - logit(mX, vBeta)))
logLikelihoodLogit(vBeta0, mX, vY) # test function call
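(Editor's note: a possible hardening of the same cost, a sketch not taken from the original post: clip the predicted probabilities away from exactly 0 and 1 so np.log never receives a zero argument if the model perfectly separates some observations:)

def logLikelihoodLogitClipped(vBeta, mX, vY):
    # hypothetical variant of logLikelihoodLogit with clipped probabilities
    p = np.clip(logit(mX, vBeta), 1e-12, 1 - 1e-12)
    return -np.sum(vY * np.log(p) + (1 - vY) * np.log(1 - p))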

# gradient function
def likelihoodScore(vBeta, mX, vY):
    return np.dot(mX.T,
                  ((np.dot(mX, vBeta) - vY)
                   / np.dot(mX, vBeta)).reshape(iN, 1)).reshape(iK, 1)

likelihoodScore(vBeta0, mX, vY).shape # test function call
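(Editor's note: for reference, the textbook gradient of the negative log-likelihood above is X'(sigmoid(X beta) - y), which differs from the expression in likelihoodScore. Also, fmin_bfgs expects fprime to return a flat array with the same shape as x0, not an (iK, 1) column. A sketch of that standard score function, offered as an assumption about what the asker intends rather than as their code, would be:)

def likelihoodScoreStandard(vBeta, mX, vY):
    # gradient of logLikelihoodLogit: X.T @ (sigmoid(X @ beta) - y),
    # returned as a 1-D array of length iK, as fmin_bfgs requires
    return np.dot(mX.T, logit(mX, vBeta) - vY)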

# optimize the function (without gradient)
optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit,
                                      x0=np.array([-.1, -.03, -.01, .44, .92, .53,
                                                   1.8, .71]),
                                      args=(mX, vY), gtol=1e-3)

# optimize the function (with gradient)
optimLogit = scipy.optimize.fmin_bfgs(logLikelihoodLogit,
                                      x0=np.array([-.1, -.03, -.01, .44, .92, .53,
                                                   1.8, .71]),
                                      fprime=likelihoodScore,
                                      args=(mX, vY), gtol=1e-3)
