Curve fitting in Scipy with 3d data and parameters

Question

I am working on fitting a 3d distribution function in scipy. I have a numpy array with counts in x- and y-bins, and I am trying to fit it to a rather complicated 3-d distribution function. The data is fit to 26 (!) parameters, which describe the shape of its two constituent populations.

I learned here that I have to pass my x- and y-coordinates as args when I call leastsq. The code presented by unutbu works as written for me, but when I try to apply it to my specific case, I get the error "TypeError: leastsq() got multiple values for keyword argument 'args'".

Here's my code (sorry for the length):

import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as spopt
from textwrap import wrap
import collections

cl = 0.5
ch = 3.5
rl = -23.5
rh = -18.5
mbins = 10
cbins = 10

def hist_data(mixed_data, mbins, cbins):
    import numpy as np
    H, xedges, yedges = np.histogram2d(mixed_data[:,1], mixed_data[:,2], bins = (mbins, cbins), weights = mixed_data[:,3])
    x, y = 0.5 * (xedges[:-1] + xedges[1:]), 0.5 * (yedges[:-1] + yedges[1:])
    return H.T, x, y

def gauss(x, s, mu, a):
    import numpy as np
    return a * np.exp(-((x - mu)**2. / (2. * s**2.)))

def tanhlin(x, p0, p1, q0, q1, q2):
    import numpy as np
    return p0 + p1 * (x + 20.) + q0 * np.tanh((x - q1)/q2)

def func3d(p, x, y):
    import numpy as np
    from sys import exit
    rsp0, rsp1, rsq0, rsq1, rsq2, rmp0, rmp1, rmq0, rmq1, rmq2, rs, rm, ra, bsp0, bsp1, bsq0, bsq1, bsq2, bmp0, bmp1, bmq0, bmq1, bmq2, bs, bm, ba = p
    x, y = np.meshgrid(coords[0], coords[1])
    rs = tanhlin(x, rsp0, rsp1, rsq0, rsq1, rsq2)
    rm = tanhlin(x, rmp0, rmp1, rmq0, rmq1, rmq2)
    ra = schechter(x, rap, raa, ram) # unused
    bs = tanhlin(x, bsp0, bsp1, bsq0, bsq1, bsq2)
    bm = tanhlin(x, bmp0, bmp1, bmq0, bmq1, bmq2)
    ba = schechter(x, bap, baa, bam) # unused
    red_dist = ra / (rs * np.sqrt(2 * np.pi)) * gauss(y, rs, rm, ra)
    blue_dist = ba / (bs * np.sqrt(2 * np.pi)) * gauss(y, bs, bm, ba)
    result = red_dist + blue_dist
    return result

def residual(p, coords, data):
    import numpy as np
    model = func3d(p, coords)
    res = (model.flatten() - data.flatten())
    # can put parameter restrictions in here
    return res

def poiss_err(data):
    import numpy as np
    return np.where(np.sqrt(H) > 0., np.sqrt(H), 2.)

# =====

H, x, y = hist_data(mixed_data, mbins, cbins)

data = H

coords = x, y
# x and y will be the projected coordinates of the data H onto the plane z = 0

# x has bins of width 0.5, with centers at -23.25, -22.75, ... , -19.25, -18.75
# y has bins of width 0.3, with centers at 0.65, 0.95, ... , 3.05, 3.35    

Param = collections.namedtuple('Param', 'rsp0 rsp1 rsq0 rsq1 rsq2 rmp0 rmp1 rmq0 rmq1 rmq2 rs rm ra bsp0 bsp1 bsq0 bsq1 bsq2 bmp0 bmp1 bmq0 bmq1 bmq2 bs bm ba')
p_guess = Param(rsp0 = 0.152, rsp1 = 0.008, rsq0 = 0.044, rsq1 = -19.91, rsq2 = 0.94, rmp0 = 2.279, rmp1 = -0.037, rmq0 = -0.108, rmq1 = -19.81, rmq2 = 0.96, rs = 1., rm = -20.5, ra = 10000., bsp0 = 0.298, bsp1 = 0.014, bsq0 = -0.067, bsq1 = -19.90, bsq2 = 0.58, bmp0 = 1.790, bmp1 = -0.053, bmq0 = -0.363, bmq1 = -20.75, bmq2 = 1.12, bs = 1., bm = -20., ba = 2000.)

opt, cov, infodict, mesg, ier = spopt.leastsq(residual, p_guess, poiss_err(H), args = coords, maxfev = 100000, full_output = True)

Here's my data, just with fewer bins:

[[  1.00000000e+01   1.10000000e+01   2.10000000e+01   1.90000000e+01
1.70000000e+01   2.10000000e+01   2.40000000e+01   1.90000000e+01
2.80000000e+01   1.90000000e+01]
[  1.40000000e+01   4.50000000e+01   6.00000000e+01   6.80000000e+01
1.34000000e+02   1.97000000e+02   2.23000000e+02   2.90000000e+02
3.23000000e+02   3.03000000e+02]
[  3.00000000e+01   1.17000000e+02   3.78000000e+02   9.74000000e+02
1.71900000e+03   2.27700000e+03   2.39000000e+03   2.25500000e+03
1.85600000e+03   1.31000000e+03]
[  1.52000000e+02   9.32000000e+02   2.89000000e+03   5.23800000e+03
6.66200000e+03   6.19100000e+03   4.54900000e+03   3.14600000e+03
2.09000000e+03   1.33800000e+03]
[  5.39000000e+02   2.58100000e+03   6.51300000e+03   8.89900000e+03
8.52900000e+03   6.22900000e+03   3.55000000e+03   2.14300000e+03
1.19000000e+03   6.92000000e+02]
[  1.49600000e+03   4.49200000e+03   8.77200000e+03   1.07610000e+04
9.76700000e+03   7.04900000e+03   4.23200000e+03   2.47200000e+03
1.41500000e+03   7.02000000e+02]
[  2.31800000e+03   7.01500000e+03   1.28870000e+04   1.50840000e+04
1.35590000e+04   8.55600000e+03   4.15600000e+03   1.77100000e+03
6.57000000e+02   2.55000000e+02]
[  1.57500000e+03   3.79300000e+03   5.20900000e+03   4.77800000e+03
3.26600000e+03   1.44700000e+03   5.31000000e+02   1.85000000e+02
9.30000000e+01   4.90000000e+01]
[  7.01000000e+02   1.21600000e+03   1.17600000e+03   7.93000000e+02
4.79000000e+02   2.02000000e+02   8.80000000e+01   3.90000000e+01
2.30000000e+01   1.90000000e+01]
[  2.93000000e+02   3.93000000e+02   2.90000000e+02   1.97000000e+02
1.18000000e+02   6.40000000e+01   4.10000000e+01   1.20000000e+01
1.10000000e+01   4.00000000e+00]]

Thanks very much!

Recommended answer

So what leastsq does is try to:

    "Minimize the sum of squares of a set of equations."
    (scipy docs)

As it says, it minimizes a set of functions, so it does not take x or y data inputs directly (look at its arguments in the docs). You can handle the data however you like and pass a residual function, but it is significantly easier to just use curve_fit, which does that for you :) and creates the necessary equations.

For fitting you should use curve_fit, if you are OK with the generic residual it uses, which is built from the function you pass in (its output minus the data). If you look at the curve_fit source, it ends up calling res = leastsq(func, p0, args=args, full_output=1, **kw).
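To make the relationship between the two functions concrete, here is a minimal sketch that runs the same fit both ways. The straight-line model `line`, the synthetic data, and the starting values are assumptions chosen only for illustration; they are not part of the question.

```python
import numpy as np
from scipy.optimize import curve_fit, leastsq

# Hypothetical model for illustration: a line with slope m and intercept c
def line(x, m, c):
    return m * x + c

# synthetic data with a little Gaussian noise (fixed seed for repeatability)
x = np.linspace(0.0, 10.0, 50)
rng = np.random.default_rng(0)
y = line(x, 2.0, -1.0) + 0.1 * rng.standard_normal(x.size)

# curve_fit builds the residual line(x, *p) - y internally
popt_cf, _ = curve_fit(line, x, y, p0=[1.0, 0.0])

# the same fit with an explicit residual passed to leastsq
def residual(p, x, y):
    return line(x, *p) - y

popt_ls, ier = leastsq(residual, [1.0, 0.0], args=(x, y))

print('curve_fit:', popt_cf)
print('leastsq:  ', popt_ls)
```

Both calls should land on essentially the same parameters. Writing the residual yourself is mainly useful when you want something curve_fit does not set up for you, such as parameter restrictions inside the residual.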

E.g., if I fit the Rosenbrock function in 2d and guess the y-parameter:

from scipy.optimize import curve_fit
import numpy as np

# Rosenbrock function in 2d: x is the independent variable, a is the parameter to fit
def rosen2d(x, a):
    return (1 - x)**2 + 100 * (a - x**2)**2

# generate sample data from a known parameter value
datax = np.array([.01 * x for x in range(-10, 10)])
datay = 2.3
dataz = rosen2d(datax, datay)
# fit the parameter a (with no p0 given, curve_fit starts from 1)
optimalparams, covmatrix = curve_fit(rosen2d, datax, dataz)
print('opt:', optimalparams)

Fitting the Colville function in 4d:

from scipy.optimize import curve_fit
import numpy as np

# 4 dimensional colville function
# definition from http://www.sfu.ca/~ssurjano/colville.html
# x holds the independent variables (x1, x2) as columns; x3 and x4 are the fit parameters
def colville(x, x3, x4):
    x1, x2 = x[:,0], x[:,1]
    return 100*(x1**2 - x2)**2 + (x1-1)**2 + (x3-1)**2 + \
            90*(x3**2 - x4)**2 + \
            10.1*((x2 - 1)**2 + (x4 - 1)**2) + \
            19.8*(x2 - 1)*(x4 - 1)

# generate sample coordinates (float dtype so the noise can be added in place)
datax = np.array([[x, x] for x in range(-10, 10)], dtype=float)
# add uniform noise in [0, 1)
datax += np.random.rand(*datax.shape)
# choose the true values of the 2 parameters to be fitted
x3 = 3.5
x4 = 4.5
# calculate the function
dataz = colville(datax, x3, x4)
# fit the function
optimalparams, covmatrix = curve_fit(colville, datax, dataz)
print('opt:', optimalparams)
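The same pattern can be carried over to counts binned on an x-y grid, as in the question: flatten the grid of bin centers into one coordinate array and flatten the histogram to match. The sketch below assumes a simple 2-d Gaussian (`gauss2d`, a hypothetical stand-in for the question's 26-parameter model) and synthetic noise-free counts; only the bin centers are taken from the question's comments.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical 2-d Gaussian model; xy is a (2, N) array of flattened grid coordinates
def gauss2d(xy, amp, x0, y0, sx, sy):
    x, y = xy
    return amp * np.exp(-((x - x0)**2 / (2 * sx**2) + (y - y0)**2 / (2 * sy**2)))

# bin centers as described in the question
xc = np.linspace(-23.25, -18.75, 10)
yc = np.linspace(0.65, 3.35, 10)
X, Y = np.meshgrid(xc, yc)
xy = np.vstack([X.ravel(), Y.ravel()])

# synthetic "counts" generated from known (assumed) parameters
true = (5000.0, -21.0, 2.0, 1.0, 0.5)
H = gauss2d(xy, *true).reshape(X.shape)

# fit the flattened histogram, starting from a rough guess
p0 = (4000.0, -20.5, 1.8, 1.2, 0.6)
popt, pcov = curve_fit(gauss2d, xy, H.ravel(), p0=p0)
print('opt:', popt)
```

If per-bin errors are available, they can be passed through curve_fit's `sigma` argument as a flattened array of the same length as the data.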

Using a custom residual function:

from scipy.optimize import leastsq
import numpy as np

# 4 dimensional colville function
# definition from http://www.sfu.ca/~ssurjano/colville.html
# x holds the independent variables (x1, x2) as columns; x3 and x4 are the fit parameters
def colville(x, x3, x4):
    x1, x2 = x[:,0], x[:,1]
    return 100*(x1**2 - x2)**2 + (x1-1)**2 + (x3-1)**2 + \
            90*(x3**2 - x4)**2 + \
            10.1*((x2 - 1)**2 + (x4 - 1)**2) + \
            19.8*(x2 - 1)*(x4 - 1)

# generate sample coordinates (float dtype so the noise can be added in place)
datax = np.array([[x, x] for x in range(-10, 10)], dtype=float)
# add uniform noise in [0, 1)
datax += np.random.rand(*datax.shape)
# choose the true values of the 2 parameters to be fitted
x3 = 3.5
x4 = 4.5

# residual between the data y and the model evaluated with parameters p
def residual(p, x, y):
    return y - colville(x, *p)

# calculate the function
dataz = colville(datax, x3, x4)
# guess some initial parameter values
p0 = [0, 0]
# minimize the sum of squares of the residual; x and y are passed through args
optimalparams = leastsq(residual, p0, args=(datax, dataz))[0]
print('opt:', optimalparams)

Edit: you passed args both positionally and as a keyword. If you look at the docs, you'll see that args is the third positional parameter of leastsq, but it can also be given as a keyword argument. You used both (poiss_err(H) in the third position and args = coords as a keyword), so the function is, as expected, confused.
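A minimal sketch that reproduces the error and shows one way around it. The straight-line residual and the data here are hypothetical and far simpler than the question's model; the point is only where each argument goes. In the question's terms, the fix would presumably be something like args=(coords, H), with any error weighting applied inside residual.

```python
import numpy as np
from scipy.optimize import leastsq

# Hypothetical residual, used only to illustrate the call signature:
# a straight-line model weighted by a per-point error sigma
def residual(p, x, y, sigma):
    m, c = p
    return (m * x + c - y) / sigma

x = np.linspace(0.0, 5.0, 20)
y = 3.0 * x + 1.0
sigma = np.ones_like(x)
p0 = [1.0, 0.0]

# Mistake: sigma lands in the third positional slot, which is args,
# and args is then given again as a keyword
got_type_error = False
try:
    leastsq(residual, p0, sigma, args=(x, y))
except TypeError as err:
    got_type_error = True
    print('TypeError:', err)

# Fix: put everything the residual needs into the single args tuple
popt, ier = leastsq(residual, p0, args=(x, y, sigma))
print('opt:', popt)
```

The contents of args are unpacked after the parameter vector, so the tuple has to match the residual's signature in order.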
