Python - 最小化卡方 [英] Python - Minimizing Chi-squared

查看:86
本文介绍了Python - 最小化卡方的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我一直在尝试通过最小化卡方来将线性模型拟合到一组应力/应变数据中.不幸的是,使用下面的代码没有正确地最小化 chisqfunc 函数.它正在寻找初始条件下的最小值,x0,这是不正确的.我浏览了 scipy.optimize 文档并测试了最小化其他正常工作的功能.您能否建议如何修复下面的代码,或者建议我可以使用另一种方法通过最小化卡方来拟合数据的线性模型?

I have been trying to fit a linear model to a set of stress/strain data by minimizing chi-squared. Unfortunately using the code below is not correctly minimizing the chisqfunc function. It is finding the minimum at the initial conditions, x0, which is not correct. I have looked through the scipy.optimize documentation and tested minimizing other functions which has worked correctly. Could you please suggest how to fix the code below or suggest another method I can use to fit a linear model to data by minimizing chi-squared?

import numpy
import scipy.optimize as opt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]

def chisqfunc((a, b)):
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result =  opt.minimize(chisqfunc, x0)
print result

感谢您阅读我的问题,任何帮助将不胜感激.

Thank you for reading my question and any help would be greatly appreciated.

干杯,威尔

我目前使用的数据集:数据链接

推荐答案

问题在于您最初的猜测与实际解决方案相去甚远.如果您在 chisqfunc() 中添加一个打印语句,例如 print (a,b),然后重新运行您的代码,您将得到如下结果:

The problem is that your initial guess is very far from the actual solution. If you add a print statement inside chisqfunc() like print (a,b), and rerun your code, you'll get something like:

(0, 0)
(1.4901161193847656e-08, 0.0)
(0.0, 1.4901161193847656e-08)

这意味着 minimize 仅在这些点评估函数.

This means that minimize evaluates the function only at these points.

如果您现在尝试评估这 3 对值的 chisqfunc(),您将看到它们完全匹配,例如

if you now try to evaluate chisqfunc() at these 3 pairs of values, you'll see that they EXACTLY match, for example

print chisqfunc((0,0))==chisqfunc((1.4901161193847656e-08,0))
True

这是因为四舍五入浮点运算.换句话说,在评估stress - model 时,var stressmodel 大太多数量级,结果被截断.

This happens because of rounding floating points arithmetics. In other words, when evaluating stress - model, the var stress is too many order of magnitude larger than model, and the result is truncated.

然后可以尝试对其进行暴力破解,提高浮点精度,在使用 loadtxt 加载数据后立即写入 data=data.astype(np.float128).minimize 失败,带有 result.success=False,但有一个有用的消息

One could then just try bruteforcing it, increasing floating point precision, with writing data=data.astype(np.float128) just after loading the data with loadtxt. minimize fails, with result.success=False, but with a helpful message

由于精度损失,不一定能达到预期的误差.

Desired error not necessarily achieved due to precision loss.

一种可能性是提供更好的初始猜测,以便在减法 stress - modelmodel 部分具有相同的数量级,另一个是重新调整数据,使解更接近您的初始猜测 (0,0).

One possibility is then to provide a better initial guess, so that in the subtraction stress - model the model part is of the same order of magnitude, the other to rescale the data, so that the solution will be closer to your initial guess (0,0).

如果您只是重新调整数据,则要好得多,例如,相对于某个应力值(例如这种材料的膨胀/开裂)进行无量纲化

It is MUCH better if you just rescale the data, making for example nondimensional with respect to a certain stress value (like the yelding/cracking of this material)

这是一个拟合示例,使用最大测量应力作为应力标度.您的代码几乎没有变化:

This is an example of the fitting, using as a stress scale the maximum measured stress. There are very few changes from your code:

import numpy
import scipy.optimize as opt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]


smax = stress.max()
stress = stress/smax
#I am assuming the errors err_stress are in the same units of stress.
err_stress = err_stress/smax

def chisqfunc((a, b)):
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result =  opt.minimize(chisqfunc, x0)
print result
assert result.success==True
a,b=result.x*smax
plot(strain,stress*smax)
plot(strain,a+b*strain)

你的线性模型非常好,即你的材料在这个变形范围内具有非常线性的行为(它到底是什么材料?):

Your linear model is quite good, i.e. your material has a very linear behaviour for this range of deformation (what material is it anyway?):

这篇关于Python - 最小化卡方的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆