在python中求解非线性方程 [英] Solving non-linear equations in python

查看:92
本文介绍了在python中求解非线性方程的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有4个非线性方程,要求解三个未知数XYZ.方程的形式为:

I have 4 non-linear equations with three unknowns X, Y, and Z that I want to solve for. The equations are of the form:

F(m) = X^2 + a(m)Y^2 + b(m)XYcosZ + c(m)XYsinZ

...,其中abc是常数,取决于四个方程式中F的每个值.

...where a, b and c are constants which are dependent on each value of F in the four equations.

解决此问题的最佳方法是什么?

What is the best way to go about solving this?

推荐答案

有两种方法可以做到这一点.

There are two ways to do this.

  1. 使用非线性求解器
  2. 线性化问题并从最小二乘意义上解决问题

设置

因此,据我所知,您知道在4个不同点处的F,a,b和c,并且想要对模型参数X,Y和Z求逆.我们有3个未知数和4个观测数据要点,所以问题是过分确定的.因此,我们将在最小二乘意义上进行求解.

Setup

So, as I understand your question, you know F, a, b, and c at 4 different points, and you want to invert for the model parameters X, Y, and Z. We have 3 unknowns and 4 observed data points, so the problem is overdetermined. Therefore, we'll be solving in the least-squares sense.

在这种情况下,使用相反的术语更为常见,因此让我们翻转方程式.代替:

It's more common to use the opposite terminology in this case, so let's flip your equation around. Instead of:

F_i = X^2 + a_i Y^2 + b_i X Y cosZ + c_i X Y sinZ

我们写:

F_i = a^2 + X_i b^2 + Y_i a b cos(c) + Z_i a b sin(c)

我们在4个不同点(例如F_0, F_1, ... F_i)知道FXYZ的地方.

Where we know F, X, Y, and Z at 4 different points (e.g. F_0, F_1, ... F_i).

我们只是更改变量的名称,而不是方程式本身. (这比我想的要容易得多.)

We're just changing the names of the variables, not the equation itself. (This is more for my ease of thinking than anything else.)

实际上有可能使该方程线性化.您可以轻松解决a^2b^2a b cos(c)a b sin(c).为了使操作更简单,让我们再次重新标记标签:

It's actually possible to linearize this equation. You can easily solve for a^2, b^2, a b cos(c), and a b sin(c). To make this a bit easier, let's relabel things yet again:

d = a^2
e = b^2
f = a b cos(c)
g = a b sin(c)

现在,方程式变得简单得多:F_i = d + e X_i + f Y_i + g Z_i.对defg进行最小二乘线性反转很容易.然后,我们可以从以下位置获取abc:

Now the equation is a lot simpler: F_i = d + e X_i + f Y_i + g Z_i. It's easy to do a least-squares linear inversion for d, e, f, and g. We can then get a, b, and c from:

a = sqrt(d)
b = sqrt(e)
c = arctan(g/f)

好吧,让我们用矩阵形式来写.我们将翻译4个观测值(我们将编写的代码将采用任意数量的观测值,但现在让我们保持具体):

Okay, let's write this up in matrix form. We're going to translate 4 observations of (the code we'll write will take any number of observations, but let's keep it concrete at the moment):

F_i = d + e X_i + f Y_i + g Z_i

进入:

|F_0|   |1, X_0, Y_0, Z_0|   |d|
|F_1| = |1, X_1, Y_1, Z_1| * |e|
|F_2|   |1, X_2, Y_2, Z_2|   |f|
|F_3|   |1, X_3, Y_3, Z_3|   |g|

或:F = G * m(我是地球物理学家,因此我们将G用作格林函数",将m用作模型参数".通常,我们将d用作数据",而不是F.)

Or: F = G * m (I'm a geophysist, so we use G for "Green's Functions" and m for "Model Parameters". Usually we'd use d for "data" instead of F, as well.)

在python中,这将转换为:

In python, this would translate to:

def invert(f, x, y, z):
    G = np.vstack([np.ones_like(x), x, y, z]).T
    m, _, _, _ = np.linalg.lstsq(G, f)

    d, e, f, g = m
    a = np.sqrt(d)
    b = np.sqrt(e)
    c = np.arctan2(g, f) # Note that `c` will be in radians, not degrees
    return a, b, c

非线性解决方案

您也可以使用scipy.optimize解决此问题,如@Joe所建议. scipy.optimize中最易访问的函数是scipy.optimize.curve_fit,默认情况下使用Levenberg-Marquardt方法.

Non-linear Solution

You could also solve this using scipy.optimize, as @Joe suggested. The most accessible function in scipy.optimize is scipy.optimize.curve_fit which uses a Levenberg-Marquardt method by default.

Levenberg-Marquardt是一种爬坡"算法(在这种情况下,它会走下坡路,但是无论如何都使用该术语).从某种意义上说,您可以初步估计模型参数(所有参数,默认情况下为scipy.optimize),并沿参数空间中的observed - predicted斜率向下倾斜至底部.

Levenberg-Marquardt is a "hill climbing" algorithm (well, it goes downhill, in this case, but the term is used anyway). In a sense, you make an initial guess of the model parameters (all ones, by default in scipy.optimize) and follow the slope of observed - predicted in your parameter space downhill to the bottom.

注意事项:选择正确的非线性反演方法,初步猜测并调整方法的参数非常黑暗".您只能通过这样做来学习它,并且在许多情况下,事情将无法正常运行.如果您的参数空间相当平滑(应该是这样),那么Levenberg-Marquardt是一个很好的通用方法.在其他情况下,还有很多其他方法(包括遗传算法,神经网络等,除了更常见的方法(如模拟退火)以外).我在这里不打算深入探讨这一部分.

Caveat: Picking the right non-linear inversion method, initial guess, and tuning the parameters of the method is very much a "dark art". You only learn it by doing it, and there are a lot of situations where things won't work properly. Levenberg-Marquardt is a good general method if your parameter space is fairly smooth (this one should be). There are a lot of others (including genetic algorithms, neural nets, etc in addition to more common methods like simulated annealing) that are better in other situations. I'm not going to delve into that part here.

有一个常见的陷阱,一些优化工具包试图纠正scipy.optimize不能解决的问题.如果您的模型参数具有不同的大小(例如a=1, b=1000, c=1e-8),则需要重新缩放比例,以使它们的大小相似.否则,scipy.optimize的爬山"算法(如LM)将无法准确计算局部梯度的估计值,并且会给出非常不准确的结果.现在,我假设abc具有相对相似的幅度.另外,请注意,基本上所有非线性方法都需要您进行初步猜测,并且对此猜测很敏感.我将其保留在下面(将其作为p0 kwarg传递给curve_fit),因为默认的a, b, c = 1, 1, 1a, b, c = 3, 2, 1的相当准确的猜测.

There is one common gotcha that some optimization toolkits try to correct for that scipy.optimize doesn't try to handle. If your model parameters have different magnitudes (e.g. a=1, b=1000, c=1e-8), you'll need to rescale things so that they're similar in magnitude. Otherwise scipy.optimize's "hill climbing" algorithms (like LM) won't accurately calculate the estimate the local gradient, and will give wildly inaccurate results. For now, I'm assuming that a, b, and c have relatively similar magnitudes. Also, be aware that essentially all non-linear methods require you to make an initial guess, and are sensitive to that guess. I'm leaving it out below (just pass it in as the p0 kwarg to curve_fit) because the default a, b, c = 1, 1, 1 is a fairly accurate guess for a, b, c = 3, 2, 1.

在不加警告的情况下,curve_fit希望传递给函数,进行观测的一组点(作为单个ndim x npoints数组)和观测值.

With the caveats out of the way, curve_fit expects to be passed a function, a set of points where the observations were made (as a single ndim x npoints array), and the observed values.

因此,如果我们这样编写函数:

So, if we write the function like this:

def func(x, y, z, a, b, c):
    f = (a**2
         + x * b**2
         + y * a * b * np.cos(c)
         + z * a * b * np.sin(c))
    return f

在将其传递给curve_fit之前,我们需要对其进行包装以接受稍有不同的参数.

We'll need to wrap it to accept slightly different arguments before passing it to curve_fit.

简而言之:

def nonlinear_invert(f, x, y, z):
    def wrapped_func(observation_points, a, b, c):
        x, y, z = observation_points
        return func(x, y, z, a, b, c)

    xdata = np.vstack([x, y, z])
    model, cov = opt.curve_fit(wrapped_func, xdata, f)
    return model

两种方法的独立示例:

为了给您一个完整的实现,下面是一个示例

Stand-alone Example of the two methods:

To give you a full implementation, here's an example that

  1. 生成随机分布的点以评估函数,
  2. 使用设置的模型参数评估这些点上的功能,
  3. 在结果中添加噪音
  4. 然后使用上述线性方法和非线性方法对模型参数进行求逆.


import numpy as np
import scipy.optimize as opt

def main():
    nobservations = 4
    a, b, c = 3.0, 2.0, 1.0
    f, x, y, z = generate_data(nobservations, a, b, c)

    print 'Linear results (should be {}, {}, {}):'.format(a, b, c)
    print linear_invert(f, x, y, z)

    print 'Non-linear results (should be {}, {}, {}):'.format(a, b, c)
    print nonlinear_invert(f, x, y, z)

def generate_data(nobservations, a, b, c, noise_level=0.01):
    x, y, z = np.random.random((3, nobservations))
    noise = noise_level * np.random.normal(0, noise_level, nobservations)
    f = func(x, y, z, a, b, c) + noise
    return f, x, y, z

def func(x, y, z, a, b, c):
    f = (a**2
         + x * b**2
         + y * a * b * np.cos(c)
         + z * a * b * np.sin(c))
    return f

def linear_invert(f, x, y, z):
    G = np.vstack([np.ones_like(x), x, y, z]).T
    m, _, _, _ = np.linalg.lstsq(G, f)

    d, e, f, g = m
    a = np.sqrt(d)
    b = np.sqrt(e)
    c = np.arctan2(g, f) # Note that `c` will be in radians, not degrees
    return a, b, c

def nonlinear_invert(f, x, y, z):
    # "curve_fit" expects the function to take a slightly different form...
    def wrapped_func(observation_points, a, b, c):
        x, y, z = observation_points
        return func(x, y, z, a, b, c)

    xdata = np.vstack([x, y, z])
    model, cov = opt.curve_fit(wrapped_func, xdata, f)
    return model

main()

这篇关于在python中求解非线性方程的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆