Alternative parametrization of the negative binomial in scipy

Question

In scipy, the negative binomial distribution is defined as:

nbinom.pmf(k) = choose(k+n-1, n-1) * p**n * (1-p)**k

This is the common definition; see also Wikipedia: https://en.wikipedia.org/wiki/Negative_binomial_distribution

However, there is a different parametrization in which the negative binomial is defined by the mean mu and a dispersion parameter.

In R this is easy, as the negative binomial can be specified with either parametrization:

dnbinom(x, size, prob, mu, log = FALSE)

How can I use the mean/dispersion parametrization in scipy?

Edit:

Straight from the R help:

The negative binomial distribution with size = n and prob = p has density

Γ(x+n)/(Γ(n) x!) p^n (1-p)^x

An alternative parametrization (often used in ecology) is by the mean mu (see above) and size, the dispersion parameter, where prob = size/(size+mu). The variance is mu + mu^2/size in this parametrization.

It is also described here in more detail:

https://en.wikipedia.org/wiki/Negative_binomial_distribution#Alternative_formulations
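
The relationship quoted from the R help can be checked numerically. The following is a minimal sketch, assuming that scipy's nbinom(n, p) takes n as R's size and p as R's prob; the values of mu and size are arbitrary and chosen only for illustration.

```python
from scipy.stats import nbinom

mu, size = 4.0, 2.5        # arbitrary illustrative values, not from the question
prob = size / (size + mu)  # conversion quoted from the R help

# Ask scipy for the mean and variance of nbinom(n=size, p=prob).
mean, var = nbinom.stats(size, prob, moments="mv")

print(float(mean), mu)                    # the two values should agree
print(float(var), mu + mu ** 2 / size)    # the two values should agree
```

If both pairs match, passing (size, size / (size + mu)) to scipy reproduces the mean/dispersion parametrization.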

Recommended answer

from scipy.stats import nbinom


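def convert_params(mu, theta):
    """
    Convert the mean/dispersion parameterization of a negative binomial
    to the (n, p) parameterization that scipy supports.

    See https://en.wikipedia.org/wiki/Negative_binomial_distribution#Alternative_formulations
    """
    r = theta  # the dispersion (size) parameter is passed to scipy as n
    var = mu + 1 / r * mu ** 2  # variance in the mean/dispersion form
    p = (var - mu) / var  # equals mu / (r + mu)
    return r, 1 - p  # scipy's p is therefore r / (r + mu)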


def pmf(counts, mu, theta):
    """
    >>> import numpy as np
    >>> from scipy.stats import poisson
    >>> np.isclose(pmf(10, 10, 10000), poisson.pmf(10, 10), atol=1e-3)
    True
    """
    return nbinom.pmf(counts, *convert_params(mu, theta))


def logpmf(counts, mu, theta):
    return nbinom.logpmf(counts, *convert_params(mu, theta))


def cdf(counts, mu, theta):
    return nbinom.cdf(counts, *convert_params(mu, theta))


def sf(counts, mu, theta):
    return nbinom.sf(counts, *convert_params(mu, theta))
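
As a variation on the functions above, the conversion can also be wrapped in a frozen scipy distribution, so that pmf, cdf, mean and so on all come from one object. This is only a sketch: the helper name nbinom_mu and the example values are hypothetical, and the manual formula is the Gamma-function density quoted from the R help in the question.

```python
import numpy as np
from scipy.special import gammaln
from scipy.stats import nbinom


def nbinom_mu(mu, size):
    """Hypothetical helper: frozen nbinom built from the mean and the size."""
    return nbinom(size, size / (size + mu))


mu, size = 10.0, 3.0  # illustrative values only
dist = nbinom_mu(mu, size)
k = np.arange(5)

# Density from the R help, evaluated on the log scale:
# Γ(x+n)/(Γ(n) x!) p^n (1-p)^x with n = size and p = size / (size + mu)
p = size / (size + mu)
manual = np.exp(
    gammaln(k + size) - gammaln(size) - gammaln(k + 1)
    + size * np.log(p) + k * np.log1p(-p)
)

print(np.allclose(dist.pmf(k), manual))  # compares scipy's pmf with the formula
print(dist.mean(), dist.var())           # compare with mu and mu + mu**2 / size
```

A frozen distribution avoids repeating the conversion in every wrapper, at the cost of fixing mu and size when the object is created.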
