Not sure what distribution to use to model my data

Problem description

I have a set of astronomical data, to which I'm trying to fit a curve:

My fitting code is

param = stats.norm.fit(df['delta z'].dropna())   # Fit a normal distribution to the data
pdf_fitted = stats.norm.pdf(df['delta z'], *param)
x = np.linspace(*df['delta z'].agg([min, max]), 1000) # x-values
binwidth = np.diff(edges).mean()
ax.plot(x, stats.norm.pdf(x, *param)*h.sum()*binwidth, color = 'r')

which produces

Now, I'm clearly doing this the wrong way, because the curve doesn't fit the data at all. All of the tutorials I've seen generate their own data set, in which case things like the mean and the skew are already known. Another question led me to estimate the parameters with

a_estimate, loc_estimate, scale_estimate = stats.skewnorm.fit(df['delta z'])
ax.plot(x, stats.skewnorm.pdf(x, a_estimate, loc_estimate, scale_estimate), 'r-', lw=5, alpha=0.6, label='skewnorm pdf')

which produces

so how can I plot the fit with those parameters?

Solution

In the comments you state that you don't know how to plot the curve: here is a small example fitting and plotting skewnorm.

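import numpy as np
import scipy.stats as ss
import matplotlib.pyplot as plt

# Skewed sample data to fit
data = ss.expon.rvs(size=1000)

# Fit a skew-normal distribution; fit() returns (a, loc, scale)
P = ss.skewnorm.fit(data)
rX = np.linspace(min(data), max(data), 50)
rP = ss.skewnorm.pdf(rX, *P)

# density=True normalizes the histogram so it is comparable to the PDF
# (the older normed=True argument was removed from Matplotlib)
plt.hist(data, bins=25, density=True, color='slategrey')

plt.plot(rX, rP, color='darkturquoise')
plt.show()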
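
The question draws the histogram in raw counts rather than as a density, and its column contains missing values. The following is a minimal sketch of that case, not the asker's actual pipeline. It uses synthetic data as a stand-in for the 'delta z' column, and the names `values`, `clean` and `fit_and_scale` are hypothetical. It drops NaNs before fitting, scales each fitted PDF by the number of samples times the bin width so that it lines up with a count histogram, and compares the normal and skew-normal fits by log-likelihood.

```python
import numpy as np
import scipy.stats as ss

# Synthetic stand-in for the 'delta z' column (an assumption; the real data is not shown)
rng = np.random.default_rng(0)
values = ss.skewnorm.rvs(a=4, loc=0.0, scale=1.0, size=2000, random_state=rng)
values[::50] = np.nan  # mimic missing measurements

# fit() does not skip NaNs, so remove them first
clean = values[~np.isnan(values)]

# Histogram in raw counts, as in the question
counts, edges = np.histogram(clean, bins=30)
binwidth = np.diff(edges).mean()

x = np.linspace(clean.min(), clean.max(), 1000)


def fit_and_scale(dist):
    """Fit dist to the cleaned data; return parameters, log-likelihood and count-scaled PDF."""
    params = dist.fit(clean)
    loglik = np.sum(dist.logpdf(clean, *params))
    # A PDF integrates to 1; a count histogram has area N * binwidth
    curve = dist.pdf(x, *params) * counts.sum() * binwidth
    return params, loglik, curve


norm_params, norm_ll, norm_curve = fit_and_scale(ss.norm)
skew_params, skew_ll, skew_curve = fit_and_scale(ss.skewnorm)

print("norm log-likelihood:    ", norm_ll)
print("skewnorm log-likelihood:", skew_ll)
```

Passing `x` and one of the scaled curves to `ax.plot` should overlay the fit on a histogram drawn with the same bins. A higher log-likelihood only says which of the two candidates describes the data better; it does not show that either is the right model.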
