Fitting data to a probability distribution, maybe skew normal?

Problem description

I am trying to fit my data to some kind of probability distribution so that I can then generate random numbers from that distribution. Below is what the data points look like, with the data values on the x-axis and the probabilities on the y-axis.

Data plot

They look like they would fit a skew normal distribution, with a mean around 10^-4. The plot's data is actually binned from an original data set. I tried using the scipy.stats library to fit a skew normal to the original data, but the fit does not work at all.

I was wondering if anyone knows a way to fit this to any PDF? The data in my plot is below (can't post the original, raw data as it is far too large):

x = [2.0030289496413441e-07, 6.021220996561269e-07, 1.8100138940039783e-06, 5.4410065638820868e-06, 1.6355980761406714e-05, 4.916702516834233e-05, 0.00014779892439152631, 0.00044429212417263257, 0.0013355678494582283, 0.0040147942838919017, 0.012068704071088232, 0.036279223206999923, 0.10905744550124194, 0.32783299552460016, 0.98548496584223111, 2.9624248661943691, 8.9052206700550585, 26.769608940074498, 80.470994415019419, 241.90046842440222, 727.16681394735679, 2185.9055451626773, 6570.9586311220974, 19752.682098944373]

y (or P(x) in the diagram) = [2.2554525565554728e-05, 2.2554525565554728e-05, 3.1576335791776624e-05, 0.0013140978842667934, 0.00029833486088983759, 0.00083417571068968434, 0.0013023224717182351, 0.00030292744905932074, 0.00018784462533064236, 0.00015960011900197359, 5.231239486282394e-05, 4.8227744123750205e-05, 3.8972462681781172e-05, 2.9372389964277703e-05, 3.3001942979800356e-05, 2.8061790992628833e-05, 2.6056781088158009e-05, 2.522638138246609e-05, 2.4144908778509908e-05, 2.5086756895368843e-05, 2.3095834179128078e-05, 2.2554525565554745e-05, 2.2554525565554755e-05, 2.2554525565554728e-05]

Solution

You can use scipy.stats.skewnorm.fit (see the docs here) to fit a skew-normal distribution to the data.

skewnorm.fit returns the maximum likelihood estimates (MLE) of the shape, location, and scale parameters, computed from the data.

from scipy import stats

# define your dataset here

# let's make a sample with pre-defined parameters to demonstrate how it works
a, loc, scale = 1.6, -0.2, 3.2
data = stats.skewnorm(a, loc, scale).rvs(10000)

# estimate parameters of the sample
a_estimate, loc_estimate, scale_estimate = stats.skewnorm.fit(data)
print(a_estimate, loc_estimate, scale_estimate)

Output (exact values vary from run to run because the sample is random):

1.5784198343540448 -0.18066366859003175 3.1817350641737274
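
The x values in the question run from about 2e-07 to 2e+04, roughly eleven orders of magnitude, which may be one reason a direct fit on the raw values struggles. A minimal sketch of one possible workaround is to fit the skew normal to log10 of the data and transform the generated numbers back afterwards. The raw data set was not posted, so the `raw` array below is a hypothetical stand-in, and the parameters `true_a`, `true_loc`, and `true_scale` are assumptions chosen only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical stand-in for the raw data (the original data set was not posted):
# positive values whose log10 follows a skew normal, centred near 10^-4.
true_a, true_loc, true_scale = 4.0, -5.0, 1.5
raw = 10.0 ** stats.skewnorm(true_a, true_loc, true_scale).rvs(5000, random_state=rng)

# Fit in log space, where the spread of the values is manageable.
log_raw = np.log10(raw)
a_hat, loc_hat, scale_hat = stats.skewnorm.fit(log_raw)

# Draw new random numbers from the fitted distribution and map them
# back to the original scale.
fitted = stats.skewnorm(a_hat, loc_hat, scale_hat)
samples = 10.0 ** fitted.rvs(1000, random_state=rng)

print(a_hat, loc_hat, scale_hat)
print(samples.min(), np.median(samples), samples.max())
```

Whether a skew normal in log space actually describes the real data is something to check, for example by plotting `fitted.pdf` over a histogram of `log_raw`.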
