Fitting binned lognormal data in Python

Question

I have a range of particle size distribution data arranged by percentage volume fraction, like so:

size   %
6.68   0.05
9.92   1.15
etc.

I need to fit this data to a lognormal distribution, which I planned to do using SciPy's stats.lognorm.fit function, but judging by what I've read, this seems to expect the input as an array of variates rather than binned data.

I was planning to use a for loop to iterate through the data and .extend a placeholder list with each size entry the required number of times, producing a list of variates that corresponds to the binned data.

This seems really ugly and inefficient though, and the kind of thing that there's probably an easy way to do. Is there a way to input binned data into the stats.lognorm.fit function?
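For reference, the expansion described above does not need an explicit loop. The following is a minimal sketch, not a definitive implementation: `expand_binned` is a hypothetical helper, the binned table is synthetic (generated from a known lognormal, not the asker's data), and the percentages are treated as relative frequencies, which is an assumption when they are really volume fractions.

```python
import numpy as np
from scipy import stats

def expand_binned(sizes, percents, total=10_000):
    """Hypothetical helper: repeat each size in proportion to its percentage."""
    sizes = np.asarray(sizes, dtype=float)
    weights = np.asarray(percents, dtype=float)
    # Convert percentages to integer repeat counts that sum to roughly `total`
    counts = np.round(total * weights / weights.sum()).astype(int)
    return np.repeat(sizes, counts)

# Synthetic binned table for illustration only (assumed, not real measurements)
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=3.0, sigma=0.5, size=50_000)
edges = np.geomspace(sample.min(), sample.max(), 41)   # log-spaced bin edges
bin_counts, _ = np.histogram(sample, bins=edges)
centres = np.sqrt(edges[:-1] * edges[1:])               # geometric bin centres
percents = 100.0 * bin_counts / bin_counts.sum()

# Expand to pseudo-variates and fit; floc=0 pins the location parameter at zero
variates = expand_binned(centres, percents)
shape, loc, scale = stats.lognorm.fit(variates, floc=0)
mu_hat, sigma_hat = np.log(scale), shape
print(mu_hat, sigma_hat)
```

In SciPy's parameterization the shape `s` plays the role of sigma and `scale` equals exp(mu), so `mu_hat` and `sigma_hat` are directly comparable with the parameters used to generate the synthetic sample. The result is only as fine-grained as the bins, since every variate inside a bin collapses onto the bin centre.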

Solution

One possible workaround is to fit a lognormal pdf to the binned data directly: take the midpoint of each bin as the x value and the corresponding normalized bin frequency as the y value, then fit the curve to those points using scipy.optimize.curve_fit. The accuracy of the result will depend on the number of bins you have. An example is shown below:

import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np

def pdf(x, mu, sigma):
    """pdf of lognormal distribution"""

    return (np.exp(-(np.log(x) - mu)**2 / (2 * sigma**2)) / (x * sigma * np.sqrt(2 * np.pi)))

mu, sigma = 3., 1.                              # actual parameter value

data = np.random.lognormal(mu, sigma, size=1000)       # data generation
h = plt.hist(data, bins=30, density=True)       # density=True replaces the removed normed=True

y = h[0]                                        # frequencies for each bin, this is y value to fit 
xs = h[1]                                       # boundaries for each bin
delta = xs[1] - xs[0]                           # width of bins
x = xs[:-1] + delta / 2                         # midpoints of bins, this is x value to fit

popt, pcov = curve_fit(pdf, x, y, p0=[1, 1])    # data fitting, popt contains the fitted parameters
print(popt)
# [ 3.13048122  1.01360758]                       example fitting results (they vary with the random sample)

fig, ax = plt.subplots()
ax.hist(data, bins=30, density=True, align='mid', label='Histogram')
xr = np.linspace(min(xs), max(xs), 10000)
yr = pdf(xr, mu, sigma)
yf = pdf(xr, *popt)
ax.plot(xr, yr, label="Actual")
ax.plot(xr, yf, linestyle = 'dashed', label="Fitted")
ax.legend()
plt.show()
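When the data already arrives as a table of bins and fractions, as in the question, a variant of the same idea is to fit the probability mass per bin rather than the pdf at the midpoints. The sketch below is one way this might look, under stated assumptions: the bin edges are known (the question only lists sizes, so the edges here are assumed), the table is synthetic, and `make_bin_model` is a hypothetical helper. It uses the same curve_fit call, but the model returns the difference of the lognormal CDF across each bin.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import lognorm

def make_bin_model(edges):
    """Hypothetical helper: build a model returning the lognormal mass in each bin."""
    def model(_, mu, sigma):
        # SciPy's lognorm uses s=sigma and scale=exp(mu)
        cdf = lognorm.cdf(edges, s=sigma, scale=np.exp(mu))
        return np.diff(cdf)
    return model

# Synthetic binned fractions for illustration only (assumed, not real measurements)
rng = np.random.default_rng(1)
sample = rng.lognormal(mean=3.0, sigma=1.0, size=20_000)
edges = np.geomspace(1.0, 1000.0, 31)           # assumed bin edges
counts, _ = np.histogram(sample, bins=edges)
fractions = counts / counts.sum()               # for a table in percent: percent / 100

# Starting guess from the weighted moments of the log bin centres
log_centres = np.log(np.sqrt(edges[:-1] * edges[1:]))
mu0 = np.sum(fractions * log_centres)
sigma0 = np.sqrt(np.sum(fractions * (log_centres - mu0) ** 2))

# The x argument is unused by the model; bounds keep sigma positive
popt, pcov = curve_fit(
    make_bin_model(edges), edges[:-1], fractions,
    p0=[mu0, sigma0], bounds=([-np.inf, 1e-6], [np.inf, np.inf]),
)
mu_fit, sigma_fit = popt
print(mu_fit, sigma_fit)
```

Fitting bin masses avoids the midpoint approximation, which matters most when bins are wide or unevenly spaced. Whether it suits volume-fraction data depends on how the percentages were measured, so treat the fitted parameters as describing the volume-weighted distribution.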
