How to get a sigmoidal CDF curve using scipy.stats.norm.cdf and matplotlib?
Problem description
I am trying to plot the S-shaped cumulative distribution function (cdf) curve of a normal distribution. However, I ended up with a uniform distribution. What am I doing wrong?
Test Script
import numpy as np
from numpy.random import default_rng
from scipy.stats import norm
import matplotlib.pyplot as plt
siz = 1000
rg = default_rng( 12345 )
a = rg.random(size=siz)
rg = default_rng( 12345 )
b = norm.rvs(size=siz, random_state=rg)
c = norm.cdf(b)
print( 'a = ', a)
print( 'b = ', b)
print( 'c = ', c)
fig, ax = plt.subplots(3, 1)
acount, abins, aignored = ax[0].hist( a, bins=20, histtype='bar', label='a', color='C0' )
bcount, bbins, bignored = ax[1].hist( b, bins=20, histtype='bar', label='b', color='C1' )
ccount, cbins, cignored = ax[2].hist( c, bins=20, histtype='bar', label='c', color='C2' )
print( 'acount, abins, aignored = ', acount, abins, aignored)
print( 'bcount, bbins, bignored = ', bcount, bbins, bignored)
print( 'ccount, cbins, cignored = ', ccount, cbins, cignored)
ax[0].legend()
ax[1].legend()
ax[2].legend()
plt.show()
To plot the sigmoidal result of the CDF of the normally distributed random variates, I should not have used matplotlib's hist() function. Rather, I could have used the bar() function to plot my results.
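The uniform-looking histogram is expected. Applying a continuous distribution's CDF to its own random variates yields values that are uniform on [0, 1] (the probability integral transform), and hist() shows how those values are distributed, not how they depend on x. Below is a minimal sketch, assuming the same seed and sample size as the test script, that pairs each CDF value with its sorted variate to recover the S-shape. The names b_sorted, c_sorted and plot_sigmoid are illustrative and not part of the original script.

```python
import numpy as np
from numpy.random import default_rng
from scipy.stats import norm
import matplotlib.pyplot as plt

siz = 1000
rg = default_rng(12345)
b = norm.rvs(size=siz, random_state=rg)  # standard normal random variates
c = norm.cdf(b)                          # CDF evaluated at each variate

# Sort the variates so a line plot runs left to right along the x axis.
order = np.argsort(b)
b_sorted = b[order]
c_sorted = c[order]


def plot_sigmoid(ax):
    """Draw CDF value against variate value (hypothetical helper)."""
    ax.plot(b_sorted, c_sorted, color='C2', label='norm.cdf(b) vs b')
    ax.set_xlabel('b')
    ax.set_ylabel('cdf')
    ax.legend(loc='upper left')
```

To display the curve, create an axes with plt.subplots(), pass it to plot_sigmoid, and call plt.show().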
The answers from @Laaggan and @dumbPy state that using regularly spaced, ordered x values is the way to obtain the sigmoidal cdf curve. Though commonly done, that approach does not apply directly when random variates are used. I have compared their approach with what I have done to show that both give the same curve. However, my results (see the figure below) show that the usual approach of getting the cdf values does yield more occurrences of the extreme values of a normal distribution than using random variates does. Excluding the two extremes, occurrences appear uniformly distributed.
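One way to see where the extra extreme values come from is sketched here, assuming the same seed, sample count, and ±3.26 range used in the script. Evenly spaced x values put as many points in the flat tails of the CDF as in its steep middle, so many of them map to values near 0 or 1, whereas random variates rarely land in the tails. The names edges, counts_rv and counts_lin are illustrative.

```python
import numpy as np
from numpy.random import default_rng
from scipy.stats import norm

samples = 1000
rg = default_rng(12345)

# CDF values at normal random variates.
b_cdf = norm.cdf(norm.rvs(size=samples, random_state=rg))

# CDF values at evenly spaced x over the same range as the script.
c_x = np.linspace(-3.26, 3.26, samples)
c_cdf = norm.cdf(c_x)

# Count how many CDF values fall in each tenth of [0, 1].
edges = np.linspace(0, 1, 11)
counts_rv, _ = np.histogram(b_cdf, bins=edges)
counts_lin, _ = np.histogram(c_cdf, bins=edges)

print('random variates :', counts_rv)
print('evenly spaced x :', counts_lin)
```

With evenly spaced x, the first and last bins should hold roughly three times as many values as they do for random variates, because about 30% of the x range on each side lies beyond the 10th and 90th percentiles.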
I have revised my script and added comments to show how I compared the two approaches. I hope my answer can benefit others who are learning to use the rvs(), pdf(), and cdf() functions of the scipy.stats.norm distribution.
import numpy as np
from numpy.random import default_rng
from scipy.stats import norm
import matplotlib.pyplot as plt
mu = 0
sigma = 1
samples = 1000
rg = default_rng( 12345 )
a = rg.random(size=samples) #Get a uniform distribution of numbers in the range of 0 to 1.
print( 'a = ', a)
# Get pdf and cdf values using normal random variates.
rg = default_rng( 12345 ) #Recreate Bit Generator to ensure a same starting point
b_pdf = norm.rvs( loc=mu, scale=sigma, size=samples, random_state=rg ) #Draw normal random variates; despite the name these are samples, not pdf values (mu=0, sigma=1 gives -3.26 to +3.26).
b_cdf = norm.cdf( b_pdf, loc=mu, scale=sigma ) #Get cdf of the normal distribution at those variates (always between 0 and 1).
print( 'b_pdf = ', b_pdf)
print( 'b_cdf = ', b_cdf)
#To check b is normally distributed. Using the ordered x (commonly practiced):
c_x = np.linspace( mu - 3.26*sigma, mu + 3.26*sigma, samples )
c_pdf = norm.pdf( c_x, loc=mu, scale=sigma )
c_cdf = norm.cdf( c_x, loc=mu, scale=sigma )
print( 'c_x = ', c_x )
print( 'c_pdf = ', c_pdf )
print( 'c_cdf = ', c_cdf )
fig, ax = plt.subplots(3, 1)
bins=np.linspace( 0, 1, num=10 )
acount, abins, aignored = ax[0].hist( a, bins=50, histtype='bar', label='a', color='C0', alpha=0.2, density=True )
bcount, bbins, bignored = ax[0].hist( b_cdf, bins=50, histtype='bar', label='b_cdf', color='C1', alpha=0.2, density=True )
ccount, cbins, cignored = ax[0].hist( c_cdf, bins=50, histtype='bar', label='c_cdf', color='C2', alpha=0.2, density=True )
bcount, bbins, bignored = ax[1].hist( b_pdf, bins=20, histtype='bar', label='b_pdf', color='C1', alpha=0.4, density=True )
cpdf_line = ax[1].plot(c_x, c_pdf, label='c_pdf', color='C2')
bpdf_bar = ax[2].bar( b_pdf, b_cdf, label='b_cdf', color='C1', alpha=0.4, width=0.01)
ccdf_line = ax[2].plot(c_x, c_cdf, label='c_cdf', color='C2')
print( 'acount, abins, aignored = ', acount, abins, aignored)
print( 'bcount, bbins, bignored = ', bcount, bbins, bignored)
print( 'ccount, cbins, cignored = ', ccount, cbins, cignored)
ax[0].legend(loc='upper left')
ax[1].legend(loc='upper left')
ax[2].legend(loc='upper left')
plt.show()
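As a complement to the script above, a sigmoidal curve can also be drawn from the random variates alone, without calling norm.cdf, by building the empirical CDF: sort the samples and give the i-th smallest the cumulative fraction i/n. This is a sketch rather than part of the original answer, and x_sorted, ecdf and max_gap are assumed names.

```python
import numpy as np
from numpy.random import default_rng
from scipy.stats import norm

samples = 1000
rg = default_rng(12345)
b = norm.rvs(loc=0, scale=1, size=samples, random_state=rg)

# Empirical CDF: fraction of samples less than or equal to each sorted value.
x_sorted = np.sort(b)
ecdf = np.arange(1, samples + 1) / samples

# Compare with the analytic CDF evaluated at the same points.
max_gap = np.max(np.abs(ecdf - norm.cdf(x_sorted)))
print('largest gap between empirical and analytic cdf:', max_gap)
```

Plotting ecdf against x_sorted, for example with ax.plot(x_sorted, ecdf) or ax.step(x_sorted, ecdf), should trace the same S-shape as c_cdf.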