Minimizing a multivariate, differentiable function using scipy.optimize


Question

I'm trying to minimize the following function with scipy.optimize:
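
The original post shows the function as an image; reconstructed from the objective code below (so the orientation of the indices of c_ij is an assumption on my part), it is

f(\theta) = \sum_{i,j} c_{ij} \log\left(1 + e^{\theta_i - \theta_j}\right),

where c_{ij} is the (i, j) entry of the comparison-count matrix cijs.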

Its gradient looks like this:

(for those who are interested, this is the likelihood function of a Bradley-Terry-Luce model for pairwise comparisons. Very closely linked to logistic regression.)
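
Concretely, the Bradley-Terry-Luce model assigns each item i a strength parameter \theta_i and models the outcome of a comparison as

P(i \text{ beats } j) = \frac{e^{\theta_i}}{e^{\theta_i} + e^{\theta_j}} = \frac{1}{1 + e^{\theta_j - \theta_i}},

i.e. the logistic function of \theta_i - \theta_j, which is where the connection to logistic regression comes from.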

It is fairly clear that adding a constant to all the parameters does not change the value of the function. Hence, I let \theta_1 = 0. Here is the implementation of the objective function and of the gradient in Python (theta becomes x here):

import numpy as np
import scipy.misc  # scipy.misc.logsumexp moved to scipy.special.logsumexp in recent SciPy

def objective(x):
    x = np.insert(x, 0, 0.0)                 # fix theta_1 = 0
    tiles = np.tile(x, (len(x), 1))
    combs = tiles.T - tiles                  # combs[i, j] = x[i] - x[j]
    zeros = np.zeros(cijs.shape)             # cijs is the global matrix of comparison counts
    exps = np.dstack((zeros, combs))
    return np.sum(cijs * scipy.misc.logsumexp(exps, axis=2))

def gradient(x):  # NOTE: this version turned out to be wrong -- see the corrected one in the answer below
    zeros = np.zeros(cijs.shape)
    x = np.insert(x, 0, 0.0)
    tiles = np.tile(x, (len(x), 1))
    combs = tiles - tiles.T
    one = 1.0 / (np.exp(combs) + 1)
    two = 1.0 / (np.exp(combs.T) + 1)
    mat = (cijs * one) + (cijs.T * two)
    grad = np.sum(mat, axis=0)
    return grad[1:]  # Don't return the first element

Here is an example of what cijs could look like:

[[ 0  5  1  4  6]
 [ 4  0  2  2  0]
 [ 6  4  0  9  3]
 [ 6  8  3  0  5]
 [10  7 11  4  0]]

This is the code I run to perform the minimization:

x0 = np.random.random(nb_items - 1)  # nb_items is the number of items, i.e. cijs.shape[0]
# Let's try one algorithm...
xopt1 = scipy.optimize.fmin_bfgs(objective, x0, fprime=gradient, disp=True)
# And another one...
xopt2 = scipy.optimize.fmin_cg(objective, x0, fprime=gradient, disp=True)
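
For completeness, here is a minimal sketch of how these pieces could be wired together, assuming objective and gradient are defined as above; cijs and nb_items are the names used in the question, the rest is my guess at the surrounding script:

import numpy as np
import scipy.optimize

# Comparison-count matrix from the example above.
cijs = np.array([[ 0,  5,  1,  4,  6],
                 [ 4,  0,  2,  2,  0],
                 [ 6,  4,  0,  9,  3],
                 [ 6,  8,  3,  0,  5],
                 [10,  7, 11,  4,  0]])
nb_items = cijs.shape[0]

x0 = np.random.random(nb_items - 1)  # theta_1 is fixed to 0, hence one fewer free parameter
xopt = scipy.optimize.fmin_bfgs(objective, x0, fprime=gradient, disp=True)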

However, it always fails in the first iteration:

Warning: Desired error not necessarily achieved due to precision loss.
         Current function value: 73.290610
         Iterations: 0
         Function evaluations: 38
         Gradient evaluations: 27

I can't figure out why it fails. The error gets displayed because of this line: https://github.com/scipy/scipy/blob/master/scipy/optimize/optimize.py#L853

So this "Wolfe line search" does not seem to succeed, but I have no idea how to proceed from here... Any help is appreciated!

Answer

As @pv. pointed out in a comment, I made a mistake in computing the gradient. First of all, the correct (mathematical) expression for the gradient of my objective function is:
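
Reconstructed here from the updated code that follows (the exact orientation of the c_ij indices is my assumption, since the original shows the expression as an image):

\frac{\partial f}{\partial \theta_i} = \sum_j (c_{ij} + c_{ji}) \, \frac{e^{\theta_i}}{e^{\theta_i} + e^{\theta_j}} - \sum_j c_{ij}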

(notice the minus sign.) Furthermore, my Python implementation was completely wrong, beyond the sign mistake. Here's my updated gradient:

def gradient(x):
    nb_comparisons = cijs + cijs.T           # total number of comparisons between each pair
    x = np.insert(x, 0, 0.0)                 # fix theta_1 = 0, as in the objective
    tiles = np.tile(x, (len(x), 1))
    combs = tiles - tiles.T                  # combs[i, j] = x[j] - x[i]
    probs = 1.0 / (np.exp(combs) + 1)        # probs[i, j] = P(i beats j) under the model
    mat = (nb_comparisons * probs) - cijs
    grad = np.sum(mat, axis=1)
    return grad[1:]  # Don't return the first element.

To debug this, I used:

  • scipy.optimize.check_grad: showed that my gradient function was producing results very far away from an approximated (finite-difference) gradient (see the sketch after this list).
  • scipy.optimize.approx_fprime, to get an idea of what the values should look like.
  • a few hand-picked simple examples that could be analyzed by hand if needed, and a few Wolfram Alpha queries for sanity-checking.
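
As a rough illustration of that debugging workflow (a sketch assuming cijs, nb_items, objective and gradient are defined as above):

import numpy as np
import scipy.optimize

x0 = np.random.random(nb_items - 1)

# Norm of the difference between the analytical and the finite-difference gradient;
# a correct gradient should give a small value here.
err = scipy.optimize.check_grad(objective, gradient, x0)
print("gradient error:", err)

# Finite-difference approximation, to eyeball what the values should look like.
eps = np.sqrt(np.finfo(float).eps)
print("analytical: ", gradient(x0))
print("finite diff:", scipy.optimize.approx_fprime(x0, objective, eps))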
