Minimizing functions with large gradients using `scipy.optimize.minimize`


Problem description


I need to optimize a scalar function in a high-dimensional space. The function varies quickly with changing arguments such that the (absolute value of) gradients are large. The optimizers in scipy.optimize.minimize fail because the minimization procedure takes steps that are too large. The following code illustrates the problem using a simple quadratic function.

from scipy.optimize import minimize

def objective(x, scalar=1):
    """
    Quadratic objective function with optional scalar.
    """
    # Report function call for debugging
    print("objective({}, scalar={})".format(x, scalar))
    # Return function value and gradient
    return x ** 2 * scalar, 2 * x * scalar

# This optimisation succeeds
print(minimize(objective, 1, jac=True))
# This optimisation fails
print(minimize(objective, 1, (1e8, ), jac=True))


Of course, I can manually rescale the value and gradient of the function of interest, but I was wondering whether there is a recommended approach for minimizing such functions, e.g. specifying a learning rate.

Recommended answer


For large non-linear optimization problems typically one would pay attention to (at least) four things:

  1. Scaling
  2. Initial values
  3. Bounds
  4. Precise gradients and, if possible, second derivatives (for complex problems, use a modeling system that allows automatic differentiation)
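To make the scaling point concrete for the quadratic example from the question, here is one possible sketch (the `scaled` wrapper and the choice of `1e8` as the scale factor are illustrative assumptions, not part of the original answer): dividing both the value and the gradient by the known magnitude brings the gradients back to order one, and the default solver converges again.

```python
import numpy as np
from scipy.optimize import minimize

def objective(x, scalar=1):
    """Quadratic objective returning (value, gradient), as in the question."""
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2) * scalar), 2 * x * scalar

def scaled(fun, factor):
    """Hypothetical wrapper: divide value and gradient by a constant factor."""
    def wrapper(x, *args):
        value, grad = fun(x, *args)
        return value / factor, np.asarray(grad) / factor
    return wrapper

# The unscaled problem has gradients of order 1e8, so the solver takes
# huge steps; dividing by 1e8 restores gradients of order one.
result = minimize(scaled(objective, 1e8), 1.0, args=(1e8,), jac=True)
print(result.x)
```

This is the "scale things yourself" approach: the wrapped problem is numerically well-behaved, and the minimizer of the scaled function is the same point as that of the original.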


Some more advanced solvers may provide support for automatic scaling. However, scaling non-linear problems is not easy because the Jacobian changes between iterations. The strategies typically available are: scale only the linear part; scale the linear and non-linear parts once at the beginning based on initial values; or rescale the problem during the iteration process. Linear solvers have an easier job in this respect (the Jacobian is constant, so we can scale once at the beginning). `scipy.optimize.minimize` is not the most advanced solver, so I would encourage you to scale things yourself. Typically you can only do this once, before the solver starts; in some cases you can even stop the solver to rescale, then call it again using the last point as the initial value. This sounds crazy, but the trick has helped me a few times. A good initial point and good bounds also help here, by keeping the solver in reasonable regions where functions and gradients can be evaluated reliably. Finally, model reformulations can sometimes provide better scaling (replace division by multiplication, take logs, etc.).
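As a sketch of the reformulation point (using a hypothetical objective, not one from the question): when a positive objective's values and gradients explode, such as exp(100·x²), minimizing its logarithm keeps the same minimizer while shrinking the gradients to a manageable size.

```python
import numpy as np
from scipy.optimize import minimize

def raw(x):
    """Original objective: exp(100 * x^2). Overflows already at x = 3."""
    x = np.asarray(x, dtype=float)
    return float(np.exp(100.0 * np.sum(x ** 2)))

def log_objective(x):
    """Log of the objective: 100 * x^2. Same minimizer, modest gradients."""
    x = np.asarray(x, dtype=float)
    return float(100.0 * np.sum(x ** 2)), 200.0 * x

# Minimizing the log-transformed problem converges without difficulty,
# whereas the raw objective overflows to inf away from the optimum.
result = minimize(log_objective, [1.0], jac=True)
print(result.x)
```

Because log is strictly increasing, the transformed problem has exactly the same minimizer, and its gradient is available in closed form, which also satisfies the "precise gradients" point above.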

