Python scipy.optimize.fmin_l_bfgs_b error occurs
Problem description
My code implements an active learning algorithm using L-BFGS optimization. I want to optimize four parameters: alpha, beta, w and gamma.
However, when I run the code below, I get an error:
optimLogitLBFGS = sp.optimize.fmin_l_bfgs_b(func, x0 = x0, args = (X,Y,Z), fprime = func_grad)
File "C:\Python27\lib\site-packages\scipy\optimize\lbfgsb.py", line 188, in fmin_l_bfgs_b
**opts)
File "C:\Python27\lib\site-packages\scipy\optimize\lbfgsb.py", line 311, in _minimize_lbfgsb
isave, dsave)
_lbfgsb.error: failed in converting 7th argument ``g' of _lbfgsb.setulb to C/Fortran array
0-th dimension must be fixed to 22 but got 4
My code is:
# -*- coding: utf-8 -*-
import numpy as np
import scipy as sp
import scipy.stats as sps
num_labeler = 3
num_instance = 5
X = np.array([[1,1,1,1],[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5]])
Z = np.array([1,0,1,0,1])
Y = np.array([[1,0,1],[0,1,0],[0,0,0],[1,1,1],[1,0,0]])
W = np.array([[1,1,1,1],[2,2,2,2],[3,3,3,3]])
gamma = np.array([1,1,1,1,1])
alpha = np.array([1,1,1,1])
beta = 1
para = np.array([1,1,1,1,1,1,1,1,1,2,2,2,2,3,3,3,3,1,1,1,1,1])
def get_params(para):
    # extract parameters from 1D parameter vector
    assert len(para) == 22
    alpha = para[0:4]
    beta = para[4]
    W = para[5:17].reshape(3, 4)
    gamma = para[17:]
    return alpha, beta, gamma, W
def log_p_y_xz(yit,zi,sigmati): #log P(y_it|x_i,z_i)
    return np.log(sps.norm(zi,sigmati).pdf(yit))#tested
def log_p_z_x(alpha,beta,xi): #log P(z_i=1|x_i)
    return -np.log(1+np.exp(-np.dot(alpha,xi)-beta))#tested
def sigma_eta_ti(xi, w_t, gamma_t): # 1+exp(-w_t x_i -gamma_t)^-1
    return 1/(1+np.exp(-np.dot(xi,w_t)-gamma_t)) #tested
def df_alpha(X,Y,Z,W,alpha,beta,gamma):#df/dalpha
    return np.sum((2/(1+np.exp(-np.dot(alpha,X[i])-beta))-1)*np.exp(-np.dot(alpha,X[i])-beta)*X[i]/(1+np.exp(-np.dot(alpha,X[i])-beta))**2 for i in range (num_instance))
    #tested
def df_beta(X,Y,Z,W,alpha,beta,gamma):#df/dbeta
    return np.sum((2/(1+np.exp(-np.dot(alpha,X[i])-beta))-1)*np.exp(-np.dot(alpha,X[i])-beta)/(1+np.exp(-np.dot(alpha,X[i])-beta))**2 for i in range (num_instance))
def df_w(X,Y,Z,W,alpha,beta,gamma):#df/sigma * sigma/dw
    return np.sum(np.sum((-3)*(Y[i][t]**2-(-np.log(1+np.exp(-np.dot(alpha,X[i])-beta)))*(2*Y[i][t]-1))*(1/(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))**4)*(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))*(1-(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t]))))*X[i]+(1/(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))**2)*(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))*(1-(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t]))))*X[i]for t in range(num_labeler)) for i in range (num_instance))
def df_gamma(X,Y,Z,W,alpha,beta,gamma):#df/sigma * sigma/dgamma
    return np.sum(np.sum((-3)*(Y[i][t]**2-(-np.log(1+np.exp(-np.dot(alpha,X[i])-beta)))*(2*Y[i][t]-1))*(1/(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))**4)*(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))*(1-(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t]))))+(1/(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))**2)*(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))*(1-(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t]))))for t in range(num_labeler)) for i in range (num_instance))
def func(para, *args):
    alpha, beta, gamma, W = get_params(para)
    #args
    X = args[0]
    Y = args[1]
    Z = args[2]
    return np.sum(np.sum(log_p_y_xz(Y[i][t], Z[i], sigma_eta_ti(X[i],W[t],gamma[t]))+log_p_z_x(alpha, beta, X[i]) for t in range(num_labeler)) for i in range (num_instance))
    #tested
def func_grad(para, *args):
    alpha, beta, gamma, W = get_params(para)
    #args
    X = args[0]
    Y = args[1]
    Z = args[2]
    #gradients
    d_f_a = df_alpha(X,Y,Z,W,alpha,beta,gamma)
    d_f_b = df_beta(X,Y,Z,W,alpha,beta,gamma)
    d_f_w = df_w(X,Y,Z,W,alpha,beta,gamma)
    d_f_g = df_gamma(X,Y,Z,W,alpha,beta,gamma)
    return np.array([d_f_a, d_f_b,d_f_w,d_f_g])
x0 = np.concatenate([np.ravel(alpha), np.ravel(beta), np.ravel(W), np.ravel(gamma)])
optimLogitLBFGS = sp.optimize.fmin_l_bfgs_b(func, x0 = x0, args = (X,Y,Z), fprime = func_grad)
I am not sure what the problem is. Maybe func_grad is causing it? Could anyone have a look? Thanks.
Recommended answer
You need to take the derivative of func with respect to each element of your concatenated array of alpha, beta, w, gamma parameters, so func_grad ought to return a single 1D array of the same length as x0 (i.e. 22). Instead it returns a jumble of two arrays and two scalar floats nested inside an object-dtype array:
In [1]: func_grad(x0, X, Y, Z)
Out[1]:
array([array([ 0.00681272, 0.00681272, 0.00681272, 0.00681272]),
0.006684719133999417,
array([-0.01351227, -0.01351227, -0.01351227, -0.01351227]),
-0.013639910534587798], dtype=object)
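As a point of comparison, here is a minimal sketch of a gradient that satisfies this requirement. The objective toy_func, its gradient toy_grad, and the vector target are hypothetical stand-ins chosen for illustration; they are not the likelihood from the question. The only point is that fprime returns a flat float array with one entry per element of x0.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

# Hypothetical toy problem: minimize 0.5 * ||p - target||^2 over 22 parameters.
target = np.arange(22, dtype=float)

def toy_func(p):
    # Scalar objective value.
    return 0.5 * np.sum((p - target) ** 2)

def toy_grad(p):
    # One partial derivative per element of p, returned as a flat float array.
    return p - target

x0_toy = np.zeros(22)

# The gradient has exactly the same shape as the parameter vector.
assert toy_grad(x0_toy).shape == x0_toy.shape

x_opt, f_opt, info = fmin_l_bfgs_b(toy_func, x0=x0_toy, fprime=toy_grad)
print(info["warnflag"])  # 0 means the optimizer reported convergence
```

If the gradient had any other length, the call would fail with a dimension error like the one in the traceback.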
Part of the problem is that np.array([d_f_a, d_f_b,d_f_w,d_f_g]) does not concatenate those objects into a single 1D array, since some are numpy arrays and some are Python floats. That part is easily solved by using np.hstack([d_f_a, d_f_b,d_f_w,d_f_g]) instead.
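A small sketch of what np.hstack does with this mix of arrays and scalars. The numeric values below are made-up placeholders with the same shapes as the outputs shown above, not the real gradient values.

```python
import numpy as np

# Placeholder values with the same shapes as the four pieces in the question.
d_f_a = np.array([0.1, 0.2, 0.3, 0.4])  # shape (4,)
d_f_b = 0.5                             # plain Python float
d_f_w = np.array([1.0, 2.0, 3.0, 4.0])  # shape (4,)
d_f_g = 0.6                             # plain Python float

# hstack promotes the scalars to 1-element arrays and joins everything end to end.
flat = np.hstack([d_f_a, d_f_b, d_f_w, d_f_g])
print(flat.shape)  # (10,)
print(flat.dtype)  # float64
```

The result is a flat float64 array, which is the form the optimizer expects, although here it is still too short.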
However, the combined size of these objects is still only 10, whereas the output of func_grad needs to be a vector of length 22. You will need to take another look at your df_* functions. In particular, W is a (3, 4) array, but df_w only returns a (4,) vector, and gamma is a (5,) vector (para[17:] of the 22-element parameter vector), whereas df_gamma only returns a scalar.
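One way to catch this kind of mismatch early is to give every gradient block the same shape as the parameter block it belongs to, flatten the blocks in the same order used to build x0, and compare against a finite-difference gradient with scipy.optimize.check_grad. The sketch below assumes the 4 + 1 + 12 + 5 layout sliced by get_params in the question; unpack, toy_func and toy_grad are hypothetical names, and the objective is a simple sum of squares rather than the real likelihood.

```python
import numpy as np
from scipy.optimize import check_grad

def unpack(para):
    # Same layout as get_params in the question: 4 + 1 + 12 + 5 = 22.
    alpha = para[0:4]
    beta = para[4]
    W = para[5:17].reshape(3, 4)
    gamma = para[17:]
    return alpha, beta, W, gamma

def toy_func(para):
    # Hypothetical stand-in objective: half the sum of squares of all parameters.
    alpha, beta, W, gamma = unpack(para)
    return 0.5 * (np.sum(alpha ** 2) + beta ** 2 + np.sum(W ** 2) + np.sum(gamma ** 2))

def toy_grad(para):
    alpha, beta, W, gamma = unpack(para)
    d_alpha = alpha  # shape (4,), same as alpha
    d_beta = beta    # scalar, same as beta
    d_W = W          # shape (3, 4), same as W
    d_gamma = gamma  # shape (5,), same as gamma
    # Flatten each block and join them in the same order used to build x0.
    grad = np.concatenate([np.ravel(d_alpha), np.ravel(d_beta),
                           np.ravel(d_W), np.ravel(d_gamma)])
    assert grad.shape == para.shape
    return grad

x0_toy = np.linspace(-1.0, 1.0, 22)

# check_grad returns the norm of the difference between the analytic gradient
# and a finite-difference approximation; a small value suggests they agree.
err = check_grad(toy_func, toy_grad, x0_toy)
print(err)
```

Running the same check on the real func and func_grad, once the df_* functions return full-size blocks, would show whether the analytic derivatives are also numerically correct.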