Fitting a Custom Scipy Distribution

Question

I have redefined the lognormal distribution as a custom scipy class (a subclass of rv_continuous). I simulated samples from this distribution and am trying to recover the parameters I specified, but the fit method returns different parameters.

import numpy as np
import pandas as pd
from scipy.stats import rv_continuous
from scipy.special import erf
from scipy.special import erfinv

class lognorm_v2(rv_continuous):

    def _pdf(self, x, mu, sigma):
        return 1 / (x * sigma * np.sqrt(2 * np.pi)) * np.exp(-0.5 * ((np.log(x) - mu)/sigma)**2)

    def _cdf(self, x, mu, sigma):
        return 0.5 + 0.5 * erf((np.log(x) - mu)/ (np.sqrt(2)*sigma))
    
    def _sf(self, x, mu, sigma):
        u = (x)**b/(1+x**b)
        return 1 - 0.5 + 0.5 * erf((np.log(x) - mu)/ (np.sqrt(2)*sigma))
    
    def _ppf(self,x, mu, sigma):
        return np.exp(sigma * erfinv(2*x - 1) - mu)
    
    def _argcheck(self, mu, sigma):
        s = sigma > 0
        return s

np.random.seed(seed=111)
logn = lognorm_v2(name='lognorm_v2',a=0,b=np.inf)
test = logn.rvs(mu=2,sigma=1,loc=0,scale=1,size=100000)

logn.fit(test)
logn.fit(test,floc=0,fscale=1)

When loc and scale are not fixed, I obtain the parameters:

(0.9216388162274325, 0.7061876689651909, -0.0003659266464081178, 0.05399544825451739)

When they are fixed (floc=0, fscale=1), the result is:

(-2.0007136838780917, 0.7086144279779958, 0, 1)

Why can't I recover the mu = 2 and sigma = 1 specified in the original simulation? I understand that I will not get the exact values, but with 100,000 samples the estimates should be very close. My numpy version is 1.19.2 and scipy is 1.5.2. Thank you!

Recommended answer

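I've corrected the code with a proper _ppf, and it now produces good fits for mu and sigma. The quantile function has to add mu (the original subtracted it) and multiply erfinv by sqrt(2) * sigma (the original used sigma alone). I also dropped the _sf override, which referenced an undefined name b and returned the same expression as _cdf instead of 1 - CDF.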
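To see why the original fit landed near (-2, 0.71) rather than (2, 1), the short check below may help. It is only a sketch: the variable names are mine, and the two expressions are copied from the _ppf and _cdf formulas in the question. It relies on the identity norm.ppf(p) = sqrt(2) * erfinv(2p - 1), which means the original _ppf actually generates a lognormal whose log has mean -mu and standard deviation sigma / sqrt(2).

```python
import numpy as np
from scipy.special import erf, erfinv
from scipy.stats import norm

mu, sigma = 2.0, 1.0
p = np.linspace(0.01, 0.99, 99)  # probabilities at which to compare quantiles

# Quantile formula from the question's _ppf (wrong sign on mu, missing sqrt(2)).
buggy_quantiles = np.exp(sigma * erfinv(2 * p - 1) - mu)

# A lognormal whose log is Normal(mean=-mu, std=sigma/sqrt(2)).
implied_mu = -mu
implied_sigma = sigma / np.sqrt(2)
equivalent_quantiles = np.exp(implied_mu + implied_sigma * norm.ppf(p))

# The two agree, so rvs() was sampling from the "implied" distribution.
print(np.allclose(buggy_quantiles, equivalent_quantiles))
print(implied_mu, implied_sigma)

# Round-trip check: for a consistent distribution, cdf(ppf(p)) returns p.
# Here the question's _cdf formula is applied to the buggy quantiles.
round_trip = 0.5 + 0.5 * erf((np.log(buggy_quantiles) - mu) / (np.sqrt(2) * sigma))
print(np.allclose(round_trip, p))
```

A cdf(ppf(p)) round trip like the last step is a cheap way to catch this kind of mismatch before calling fit, since rvs draws its samples through _ppf when no _rvs is defined.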

Code (Python 3.9, Windows 10 x64):

import numpy as np
from scipy.stats import rv_continuous
from scipy.special import erf
from scipy.special import erfinv

SQRT2 = np.float64(1.4142135623730951)

class lognorm_v2(rv_continuous):

    def _pdf(self, x, μ, σ):
        return 1 / (x * σ * SQRT2 * np.sqrt(np.pi)) * np.exp(-0.5 * ((np.log(x) - μ)/σ)**2)

    def _cdf(self, x, μ, σ):
        return 0.5 + 0.5 * erf((np.log(x) - μ)/ (SQRT2*σ))

    def _ppf(self, x, μ, σ):
        return np.exp(μ + σ * SQRT2 * erfinv(2.0*x - 1.0))

    def _argcheck(self, μ, σ):
        s = σ > 0.0
        return s

np.random.seed(seed=111)
logn = lognorm_v2(name='lognorm_v2', a=0.0, b=np.inf)
test = logn.rvs(μ=2.0,σ=1.0,loc=0.0,scale=1.0, size=100000)

# Fix loc and scale so that only the shape parameters μ and σ are estimated.
print(logn.fit(test, floc=0, fscale=1))

This prints:

(1.9990788106319746, 1.0021523463000124, 0, 1)
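As an independent cross-check, the same parameters can be estimated with scipy's built-in lognorm, which is not part of the original answer. The sketch below assumes the usual mapping between the two parameterizations: the shape s corresponds to sigma, and with loc fixed at 0 the scale corresponds to exp(mu). The samples here are drawn with NumPy's own lognormal generator rather than the custom class, so the numbers will differ slightly from the output above.

```python
import numpy as np
from scipy.stats import lognorm

rng = np.random.default_rng(111)
mu_true, sigma_true = 2.0, 1.0

# Draw lognormal samples directly: log(samples) ~ Normal(mu_true, sigma_true).
samples = rng.lognormal(mean=mu_true, sigma=sigma_true, size=100_000)

# lognorm.fit returns (s, loc, scale); loc is held at 0 to match the
# two-parameter lognormal used in the question.
s_hat, loc_hat, scale_hat = lognorm.fit(samples, floc=0)

mu_hat = np.log(scale_hat)  # scale = exp(mu)
sigma_hat = s_hat           # shape s = sigma

print(mu_hat, sigma_hat)
```

If the custom class and the built-in distribution give noticeably different estimates on the same data, that points to an inconsistency among the custom _pdf, _cdf and _ppf methods.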
