How to fit a log-normal distribution with Scipy?


Problem description

I want to fit the log-normal parameters mu and sigma to an existing (measured) log-normal distribution.

The measured log-normal distribution is defined by the following x and y arrays:

x:
4.870000000000000760e-09
5.620000000000000859e-09
6.490000000000000543e-09
7.500000000000000984e-09
8.660000000000001114e-09
1.000000000000000021e-08
1.155000000000000085e-08
1.334000000000000067e-08
1.540000000000000224e-08
1.778000000000000105e-08
2.054000000000000062e-08
2.371000000000000188e-08
2.738000000000000099e-08
3.162000000000000124e-08
3.652000000000000541e-08
4.217000000000000637e-08
4.870000000000000595e-08
5.623000000000000125e-08
6.493999999999999784e-08
7.498999999999999850e-08
8.659999999999999460e-08
1.000000000000000087e-07
1.154800000000000123e-07
1.333500000000000129e-07
1.539900000000000177e-07
1.778300000000000247e-07
2.053499999999999958e-07
2.371399999999999913e-07
2.738399999999999692e-07
3.162300000000000199e-07
3.651700000000000333e-07
4.217000000000000240e-07
4.869700000000000784e-07
8.659600000000001124e-07
1.000000000000000167e-06


y:
1.883186407957446899e+11
3.609524622222222290e+11
7.508596384507042236e+11
2.226776878843930664e+12
4.845941940346821289e+12
7.979258430057803711e+12
1.101088735028901758e+13
1.346205871213872852e+13
1.509035024739884375e+13
1.599175638381502930e+13
1.668097844161849805e+13
1.786208191445086719e+13
2.007139089017341016e+13
2.346096336416185156e+13
2.763042850867051953e+13
3.177726578034682031e+13
3.552045143352600781e+13
3.858765218497110156e+13
4.051697248554913281e+13
4.132681209248554688e+13
4.112713068208092188e+13
4.003871248554913281e+13
3.797625966473988281e+13
3.472541513294797656e+13
3.017757826589595312e+13
2.454670317919075000e+13
1.840085110982658984e+13
1.250047161156069336e+13
7.540309609248554688e+12
3.912091102658959473e+12
1.632974141040462402e+12
4.585002890867052002e+11
1.260128910303030243e+11
7.276263267445255280e+09
1.120399584203921509e+10

Plotted, the data looks like this:

[plot of y against x]

When I now use scipy.stats.lognorm.fit like this:

shape, loc, scale = stats.lognorm.fit(y, floc=0)
mu = np.log(scale)
sigma = shape

y_fit = 1 / x * 1 / (sigma * np.sqrt(2*np.pi)) * np.exp(-(np.log(x)-mu)**2/(2*sigma**2))

The resulting y_fit looks like this:

2.774453764650559735e-92
9.215468156399056736e-92
3.066511893903929907e-91
1.022335884325557513e-90
3.371353425505715432e-90
1.107869289600567113e-89
3.632923945686527959e-89
1.186352074527947499e-88
3.843439346384186221e-88
1.241282395050092616e-87
4.012158206798217088e-87
1.283531486148302474e-86
4.102813367932395623e-86
1.306865297124819703e-85
4.149188517768147925e-85
1.309743071360157226e-84
4.121819150664498056e-84
1.289935574540856462e-83
4.028475776631639341e-83
1.251854680594688466e-82
3.876254948575364474e-82
1.194751160823721531e-81
3.669411018320463915e-81
1.122061051084741563e-80
3.418224619543735425e-80
1.037398725542414359e-79
3.134554301786779178e-79
9.436770981828214504e-79
2.828745744939237710e-78
8.447588129217592353e-78
2.512030904806250195e-77
7.442222461482558402e-77
2.195666296758331429e-76
1.598228276801569301e-74
4.622033883255558750e-74

This is obviously very far away from the original y values. I do realize that I haven't used the initial x values at all, so I assume I need to shift (and maybe also scale) the resulting distribution somehow.

However, I can't wrap my head around how to do this. How do I correctly fit a log-normal distribution in Python?
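For context, scipy.stats.lognorm.fit estimates its parameters from a sample of draws from the distribution, not from density values tabulated on an x grid, which is consistent with the mismatch described above. A minimal sketch on synthetic data, where mu_true, sigma_true and the sample size are made-up values chosen only for illustration:

```python
import numpy as np
from scipy import stats

# Hypothetical "true" parameters, not taken from the measured data above.
mu_true, sigma_true = 1.0, 0.5

# Draw individual observations from a log-normal distribution.
rng = np.random.default_rng(0)
samples = rng.lognormal(mean=mu_true, sigma=sigma_true, size=20000)

# fit() takes the raw samples; floc=0 fixes the location parameter at zero.
shape, loc, scale = stats.lognorm.fit(samples, floc=0)
mu_hat = np.log(scale)   # scale corresponds to exp(mu)
sigma_hat = shape        # shape corresponds to sigma

print(mu_hat, sigma_hat)
```

With samples as input, the estimates should land close to mu_true and sigma_true. Passing a curve's y values instead makes fit() treat them as if they were observations.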

Solution

It works out of the box with curve_fit if you scale the data first. I am not sure whether scaling and re-scaling makes sense, though (this seems to confirm the ansatz).
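The back-transformation of the fitted parameters in the code below (the scaledSol line) follows from substituting the rescaled variables into the model. A short derivation, assuming x_M and y_M denote the peak position and peak height (xM and yM in the code) and hats mark the parameters fitted on the rescaled data:

```latex
\begin{align*}
f(x; a, \mu, \sigma) &= \frac{a}{x\,\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(\ln x - \mu)^2}{2\sigma^2}\right) \\
\frac{y}{y_M} &= f\!\left(\frac{x}{x_M};\, \hat a, \hat\mu, \hat\sigma\right)
  = \frac{\hat a\, x_M}{x\,\hat\sigma\sqrt{2\pi}}
  \exp\!\left(-\frac{(\ln x - \ln x_M - \hat\mu)^2}{2\hat\sigma^2}\right) \\
y &= f\!\left(x;\; y_M\,\hat a\,x_M,\;\; \hat\mu + \ln x_M,\;\; \hat\sigma\right)
\end{align*}
```

So the amplitude picks up a factor y_M x_M, mu is shifted by ln x_M, and sigma is unchanged.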

import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit

# x and y are assumed to be NumPy arrays holding the measured data from the question.

def log_fit(x, a, mu, sigma):
    # Log-normal density multiplied by a free amplitude a.
    return a / x * 1. / (sigma * np.sqrt(2. * np.pi)) * np.exp(-(np.log(x) - mu)**2 / (2. * sigma**2))

# Locate the peak and rescale so that it sits at (1, 1). By default curve_fit
# starts with every parameter equal to 1, which is far from the raw data scale.
pp = np.argmax(y)

yM = y[pp]
xM = x[pp]

xR = x / xM
yR = y / yM
print(xM, yM)
sol, err = curve_fit(log_fit, xR, yR)
print(sol)
# Map the fitted parameters back to the original units.
scaledSol = [yM * sol[0] * xM, sol[1] + np.log(xM), sol[2]]
print(scaledSol)
# log_fit is vectorized, so it can be evaluated on whole arrays.
yF = log_fit(xR, *sol)
yFIR = log_fit(x, *scaledSol)

# Top panel: original units. Bottom panel: rescaled data.
fig = plt.figure()
ax = fig.add_subplot(2, 1, 1)
bx = fig.add_subplot(2, 1, 2)
ax.plot(x, y)
ax.plot(x, yFIR)
bx.plot(xR, yR)
bx.plot(xR, yF)
plt.show()

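This prints:

>> 7.499e-08 41326812092485.55
>> [2.93003525 0.68436895 0.87481153]
>> [9080465.32138486, -15.72154211628693, 0.8748115349982701]

The resulting plot shows the data together with the fitted curve, in original units in the top panel and rescaled in the bottom panel.

Anyhow, it does not really look like that is the right fit function for this data.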
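To check that the scale, fit, re-scale round trip recovers known parameters, here is a minimal sketch on synthetic, noise-free data. The values in true_params and the x grid are made up for illustration and are not the measured data from the question:

```python
import numpy as np
from scipy.optimize import curve_fit

def log_fit(x, a, mu, sigma):
    """Log-normal density multiplied by a free amplitude a."""
    return a / (x * sigma * np.sqrt(2.0 * np.pi)) * np.exp(
        -(np.log(x) - mu) ** 2 / (2.0 * sigma ** 2)
    )

# Synthetic stand-in for the measurement (hypothetical parameters).
true_params = (9.0e6, -15.7, 0.9)
x = np.logspace(-8.3, -6.0, 35)
y = log_fit(x, *true_params)

# Rescale so the peak sits at (1, 1), then fit with curve_fit's default start.
pp = np.argmax(y)
xM, yM = x[pp], y[pp]
sol, cov = curve_fit(log_fit, x / xM, y / yM)

# Undo the rescaling: amplitude times yM * xM, mu shifted by log(xM).
a_fit = yM * sol[0] * xM
mu_fit = sol[1] + np.log(xM)
sigma_fit = sol[2]

print(a_fit, mu_fit, sigma_fit)
```

On this synthetic input the back-transformed values should agree with true_params, which suggests the re-scaling step itself is sound. Whether a log-normal is the right model for the measured data is a separate question.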
