适合“多式联运"使用python对数正态分布到数据 [英] Fitting "multimodal" lognormal distributions to data using python

查看:49
本文介绍了适合“多式联运"使用python对数正态分布到数据的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在实验室中使用仪器测量了以下数据.由于该仪器会根据其直径将不同大小的颗粒收集在垃圾箱中,因此测量结果实际上是装箱"的:

I have the following data measured using an instrument in the lab. Since the instrument collects particles of different sizes in bins based upon their diameter the measurements are essentially "binned":

import numpy as np
import matplotlib.pylab as plt
from lmfit import models

y = np.array([196, 486, 968, 2262, 3321, 4203, 15072, 46789, 95201, 303494, 421484, 327507, 138931, 27973])
bins = np.array([0.0150, 0.0306, 0.0548, 0.0944, 0.1540, 0.2560, 0.3830, 0.6050, 0.9510, 1.6400, 2.4800, 3.6700, 5.3800, 9.9100, 15])

bin_width=np.diff(bins)
x_plot = np.add(bins[:-1],np.divide(bin_width,2))
x=x_plot
y=y

在绘制时,这里是数据的外观.以x标度为单位,有一个模式在0.1左右,另一种模式在2左右.

When plotted here is how the data look. There is one mode around 0.1 and another mode around 2 in the units of the x-scale.

在此研究领域中,通常将多峰"对数正态分布拟合到此类数据:鉴于此,我已使用LMFIT拟合了大约2的模式:

Within this research area it is common to fit "multimodal" lognormal distributions to such data: Given this I have fitted the mode around 2 using LMFIT:

model = models.LognormalModel()
params = model.make_params(center=1.5, sigma=0.6, amplitude=2214337)

result = model.fit(y, params, x=x)
print(result.fit_report())

plt.plot(x, y, label='data')
plt.plot(x, result.best_fit, label='fit')
plt.xscale("log")
plt.yscale("log")
plt.legend()
plt.show()

如预期的那样,这将非常适合第二模式在2附近.我的问题是,我还将如何着手在0.1附近适应第二模式(基本上这两种模式的总和应该适合数据)?我意识到也可以争论说三种模式会更好,但是我假设一旦我了解了如何使用两种模式,那么添加第三种就不那么容易了.

As expected this results in a good fit for the second mode around 2. My question is how would I also go about fitting a second mode around 0.1 (essentially a sum of the two modes should fit the data)? I realise it could also be argued that three modes would be better but I assume once I understand how to use two modes, the addition of a third should be trivial.

推荐答案

lmfit.Models 可以一起添加,就像:

lmfit.Models can be added together, as with:

model = (models.LognormalModel(prefix='p1_') +
         models.LognormalModel(prefix='p2_') +
         models.LognormalModel(prefix='p3_') )

params = model.make_params(p1_center=0.3, p1_sigma=0.2, p1_amplitude=1e4,
                           p2_center=1.0, p2_sigma=0.4, p2_amplitude=1e6,
                           p3_center=1.5, p3_sigma=0.6, p3_amplitude=2e7)

在复合模型中,模型的每个组件都有其自己的前缀"(任何字符串),该前缀带有参数名称.符合以下条件后,您可以获取有关Models组件的字典:

In a composite model, each component of the Model gets its own "prefix" (any string) that prepends the parameter names. You can get a dictionary of a Models components after the fit with:

components = result.eval_components()
# {'p1_': Array, 'p2_': Array, 'p3_': Array}
for compname, comp in components.keys():
    plt.plot(x, comp, label=compname)

对于拟合将在半对数或对数对数图中表示的数据,您可以考虑将模型拟合至 log(y).否则,对于 y 的非常低的值,拟合对失配不会非常敏感.

For fitting data that you would represent on a semi-log or log-log plot, you might consider fitting a model to log(y). Otherwise, the fit will not be very sensitive to misfit at very low values of y.

请注意, lmfit 模型和参数支持边界.您可能想要或发现需要放置边界,例如

Note that lmfit models and parameters support bounds. You may want to or find that you need to place bounds such as

params['p1_amplitude'].min = 0
params['p1_sigma'].min = 0
params['p1_center'].max = 0.5
params['p3_center'].min = 1.0

这篇关于适合“多式联运"使用python对数正态分布到数据的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆