Alternative parametrization of the negative binomial in scipy
Question
In scipy the negative binomial distribution is defined as:
nbinom.pmf(k) = choose(k+n-1, n-1) * p**n * (1-p)**k
This is the common definition; see also Wikipedia: https://en.wikipedia.org/wiki/Negative_binomial_distribution
However, there is a different parametrization in which the negative binomial is defined by the mean mu and the dispersion parameter.
In R this is easy, as the negative binomial can be specified with either parametrization:
dnbinom(x, size, prob, mu, log = FALSE)
How can I use the mean/dispersion parametrization in scipy?
Edit:
Straight from the R help:
The negative binomial distribution with size = n and prob = p has density
Γ(x+n)/(Γ(n) x!) p^n (1-p)^x
An alternative parametrization (often used in ecology) is by the mean mu (see above), and size, the dispersion parameter, where prob = size/(size+mu). The variance is mu + mu^2/size in this parametrization.
It is also described in more detail here:
https://en.wikipedia.org/wiki/Negative_binomial_distribution#Alternative_formulations
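The relation prob = size/(size+mu) quoted from the R help can be checked numerically. The following is a minimal sketch, not a definitive implementation: it evaluates the density formula above on the log scale and compares it with scipy's nbinom.pmf called with n = size and p = size/(size+mu). The helper name nb_logpmf_mu_size and the values of mu and size are assumptions chosen for illustration.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import nbinom


def nb_logpmf_mu_size(x, mu, size):
    """Hypothetical helper: log-density from the R help formula,
    with prob derived from the mean `mu` and dispersion `size`."""
    prob = size / (size + mu)
    # log of Γ(x+n) / (Γ(n) x!) * p^n * (1-p)^x, with n = size
    return (
        gammaln(x + size) - gammaln(size) - gammaln(x + 1)
        + size * np.log(prob) + x * np.log1p(-prob)
    )


# Illustrative values only (assumed, not from the original post).
mu, size = 3.0, 1.5
x = np.arange(0, 20)

direct = np.exp(nb_logpmf_mu_size(x, mu, size))
via_scipy = nbinom.pmf(x, size, size / (size + mu))

print(np.allclose(direct, via_scipy))
```

If the two arrays agree, scipy's (n, p) arguments can stand in for R's (size, mu) after this one-line conversion. Note that scipy accepts a non-integer n here.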
Recommended answer
from scipy.stats import nbinom


def convert_params(mu, theta):
    """
    Convert mean/dispersion parameterization of a negative binomial to the ones scipy supports

    See https://en.wikipedia.org/wiki/Negative_binomial_distribution#Alternative_formulations
    """
    r = theta
    var = mu + 1 / r * mu ** 2
    p = (var - mu) / var
    # 1 - p equals mu / var = r / (r + mu), i.e. scipy's success probability
    return r, 1 - p


def pmf(counts, mu, theta):
    """
    >>> import numpy as np
    >>> from scipy.stats import poisson
    >>> np.isclose(pmf(10, 10, 10000), poisson.pmf(10, 10), atol=1e-3)
    True
    """
    return nbinom.pmf(counts, *convert_params(mu, theta))


def logpmf(counts, mu, theta):
    return nbinom.logpmf(counts, *convert_params(mu, theta))


def cdf(counts, mu, theta):
    return nbinom.cdf(counts, *convert_params(mu, theta))


def sf(counts, mu, theta):
    return nbinom.sf(counts, *convert_params(mu, theta))
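As a usage illustration, the same conversion can also be wrapped in a frozen scipy distribution, which gives access to sampling and moments as well. This is a sketch under stated assumptions: the helper name nbinom_mu_size, the parameter values, and the sample size are hypothetical and not part of the answer above.

```python
import numpy as np
from scipy.stats import nbinom


def nbinom_mu_size(mu, size):
    """Hypothetical helper: frozen scipy nbinom built from the mean `mu`
    and the dispersion parameter `size` (called theta in the answer)."""
    p = size / (size + mu)  # same mapping as R's dnbinom(mu=, size=)
    return nbinom(size, p)


# Illustrative values only (assumed).
mu, size = 4.0, 2.5
dist = nbinom_mu_size(mu, size)

# Theoretical moments should match mu and mu + mu^2/size.
print(dist.mean(), dist.var())

# Draw samples with a seeded generator and compare empirical moments.
rng = np.random.default_rng(0)
samples = dist.rvs(size=200_000, random_state=rng)
print(samples.mean(), samples.var())
```

Returning a frozen distribution avoids repeating the parameter conversion for each of pmf, logpmf, cdf, and sf, at the cost of fixing mu and size when the object is created.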