I know scipy curve_fit can do better


Problem description

I'm using python/numpy/scipy to implement this algorithm for aligning two digital elevation models (DEMs) based on terrain aspect and slope:

"Co-registration and bias corrections of satellite elevation data sets for quantifying glacier thickness change", C. Nuth and A. Kääb, doi:10.5194/tc-5-271-2011

I have a framework set up, but the quality of the fit provided by scipy.optimize.curve_fit is poor.

import sys
import random

import numpy
import matplotlib.pyplot as plt
from osgeo import gdal

#gdaldem_slope, gdaldem_aspect and getperc_new are local helper modules

def f(x, a, b, c):
    #Cosine of aspect with amplitude a, phase b (degrees) and vertical offset c
    y = a * numpy.cos(numpy.deg2rad(b-x)) + c
    return y

def compute_offset(dh, slope, aspect):
    import scipy.optimize as optimization

    idx = random.sample(range(dh.compressed().size), 10000)
    xdata = numpy.array(aspect.compressed()[idx], float)
    ydata = numpy.array((dh/numpy.tan(numpy.deg2rad(slope))).compressed()[idx], float)

    #Generate synthetic data to test curve_fit
    #xdata = numpy.arange(0,360,0.01)
    #ydata = f(xdata, 20.0, 130.0, -3.0) + 20*numpy.random.normal(size=len(xdata))

    print(xdata)
    print(ydata)

    x0 = numpy.array([0.0, 0.0, 0.0])

    fit = optimization.curve_fit(f, xdata, ydata, x0)[0]
    #optimization.leastsq(f, x0[:], args=(xdata, ydata))
    genplot(xdata, ydata, fit)
    return fit

def genplot(x, y, fit):
    a = (numpy.arange(0,360))
    f_a = f(a, fit[0], fit[1], fit[2])
    idx = random.sample(range(x.size), 10000)
    plt.figure()
    plt.xlabel('Aspect (deg)')
    plt.ylabel('dh/tan(slope) (m)')
    plt.plot(x[idx], y[idx], 'r.')
    plt.axhline(color='k')
    plt.plot(a, f_a, 'b')
    plt.ylim(-80,80)
    plt.show()

#Input DEMs
dem1_fn = sys.argv[1]
dem2_fn = sys.argv[2]

dem1_ds = gdal.Open(dem1_fn, gdal.GA_ReadOnly)
dem2_ds = gdal.Open(dem2_fn, gdal.GA_ReadOnly)

#Extract band 1 from each dataset as masked array using internal nodata value
dem1 = getperc_new.gdal_getma(dem1_ds, 1)
dem2 = getperc_new.gdal_getma(dem2_ds, 1)

#Produce slope and aspect maps using gdaldem and load into masked arrays
dem1_slope = gdaldem_slope(dem1_fn)
dem1_aspect = gdaldem_aspect(dem1_fn)

#Compute common mask and apply to all products
common_mask = dem1.mask + dem2.mask + dem1_slope.mask + dem1_aspect.mask

diff_euler = numpy.ma.array(dem2-dem1, mask=common_mask)
dem1_slope.mask = common_mask
dem1_aspect.mask = common_mask

#Compute relationship between elevation difference, slope and aspect
fit = compute_offset(diff_euler, dem1_slope, dem1_aspect)
print(fit)

Here is the fit for my data, which initially consists of ~2 million points, but I've randomly sampled for testing/plotting purposes:

[ -14.9639559 216.01093596 -41.96806735]

There is plenty of data there for a good fit, but the result from curve_fit is poor. When I run with synthetic data, I get a nice fit:

original input parameters [20.0, 130.0, -3.0]

result from curve_fit [-19.66719631 -49.6673076 -3.12198723] (equivalent to the input parameters, since a*cos(deg2rad(b-x)) = -a*cos(deg2rad(b-180-x)))

Not sure if this has something to do with using masked arrays, a limitation of curve_fit, or if I'm just overlooking something simple. Thanks for any suggestions.

==========================

Edit 9/4/13 16:30 PDT

As suggested by @Evert and others, the problem was definitely related to outliers. I was able to obtain a much better fit after removing outliers. Looking at my old code, it seems I computed the median absolute deviation for each aspect, then removed anything outside of 2*mad before fitting.
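
For reference, that filtering step can be sketched roughly as follows; the 10-degree aspect bins and the helper name are illustrative assumptions, not the original code:

import numpy

def mad_filter(aspect, vals, bin_width=10.0, nmad=2.0):
    #Keep only points within nmad median absolute deviations of the median
    #of their aspect bin; returns a boolean mask to apply before fitting
    keep = numpy.zeros(vals.size, dtype=bool)
    for lo in numpy.arange(0, 360, bin_width):
        in_bin = (aspect >= lo) & (aspect < lo + bin_width)
        if not in_bin.any():
            continue
        med = numpy.median(vals[in_bin])
        mad = numpy.median(numpy.abs(vals[in_bin] - med))
        keep[in_bin] = numpy.abs(vals[in_bin] - med) <= nmad * mad
    return keep

Something like xdata[keep], ydata[keep] would then be passed to curve_fit in place of the raw samples.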

I generated a few additional plots back in November 2012.

But looking at these again, I'm almost positive they were generated for different input data. It's all that I can find right now, so I'm including them here as an example of a case with biased sampling. This method for DEM alignment is bound to fail for cases like these - and it has nothing to do with scipy's curve fitting abilities.

I ended up developing a different approach for alignment involving normalized cross-correlation, sub-pixel refinement, and vertical offset removal for two masked 2D numpy arrays. It is faster and consistently provides better results, although even that approach has since been superseded by the Iterative Closest Point (ICP) tool (pc_align) developed by Oleg Alexandrov as part of the NASA Ames Stereo Pipeline.
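
That code isn't reproduced here, but the central cross-correlation step can be sketched roughly like this, assuming the two arrays have already had nodata filled and their means removed (normalization, sub-pixel refinement and vertical offset removal are omitted):

import numpy
from scipy import signal

def xcorr_shift(a, b):
    #Cross-correlate two de-meaned, gap-filled DEM arrays and return the
    #integer (row, col) offset of the correlation peak from zero lag
    corr = signal.fftconvolve(a, b[::-1, ::-1], mode='same')
    peak = numpy.unravel_index(numpy.argmax(corr), corr.shape)
    return peak[0] - a.shape[0] // 2, peak[1] - a.shape[1] // 2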

Thanks for all of your responses and I apologize for abandoning this question.

Solution

If you're just trying to get a sine wave with phase offset, you don't need a non-linear fit.

You can replace a * sin(x - b) + c with a * sin(x) + b * cos(x) + c, because any sine with an offset can be written as an appropriate combination of a sine and a cosine ("phasor addition", as in a Fourier transform).

If that gives the same result then it is not the "non-linear" fit that is the problem.
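
In other words, the problem becomes ordinary linear least squares, with no dependence on a starting guess. A minimal sketch of that formulation using numpy.linalg.lstsq (the helper name here is just for illustration):

import numpy

def fit_sine_linear(x_deg, y):
    #Solve y ~ a*sin(x) + b*cos(x) + c as a linear least-squares problem
    x = numpy.deg2rad(x_deg)
    A = numpy.column_stack([numpy.sin(x), numpy.cos(x), numpy.ones_like(x)])
    (a, b, c), *_ = numpy.linalg.lstsq(A, y, rcond=None)
    #Recover amplitude/phase form: a*sin(x) + b*cos(x) = r*sin(x + phi)
    r = numpy.hypot(a, b)
    phi_deg = numpy.degrees(numpy.arctan2(b, a))
    return r, phi_deg, c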
