python scipy/numpy中的多项式pmf [英] multinomial pmf in python scipy/numpy

查看:252
本文介绍了python scipy/numpy中的多项式pmf的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

scipy/numpy中是否有内置函数来获取多项式的PMF?我不确定binom是否以正确的方式进行概括,例如

Is there a built-in function in scipy/numpy for getting the PMF of a Multinomial? I'm not sure if binom generalizes in the correct way, e.g.

# Attempt to define multinomial with n = 10, p = [0.1, 0.1, 0.8]
rv = scipy.stats.binom(10, [0.1, 0.1, 0.8])
# Score the outcome 4, 4, 2
rv.pmf([4, 4, 2])

执行此操作的正确方法是什么?谢谢.

What is the correct way to do this? thanks.

推荐答案

我不知道内置函数,并且二项式概率也无法推广(您需要对另一组可能的结果进行归一化,因为所有计数的总和必须为n,独立的二项式将不会处理).但是,实现自己非常简单,例如:

There's no built-in function that I know of, and the binomial probabilities do not generalize (you need to normalise over a different set of possible outcomes, since the sum of all the counts must be n which won't be taken care of by independent binomials). However, it's fairly straightforward to implement yourself, for example:

import math

class Multinomial(object):
  def __init__(self, params):
    self._params = params

  def pmf(self, counts):
    if not(len(counts)==len(self._params)):
      raise ValueError("Dimensionality of count vector is incorrect")

    prob = 1.
    for i,c in enumerate(counts):
      prob *= self._params[i]**counts[i]

    return prob * math.exp(self._log_multinomial_coeff(counts))

  def log_pmf(self,counts):
    if not(len(counts)==len(self._params)):
      raise ValueError("Dimensionality of count vector is incorrect")

    prob = 0.
    for i,c in enumerate(counts):
      prob += counts[i]*math.log(self._params[i])

    return prob + self._log_multinomial_coeff(counts)

  def _log_multinomial_coeff(self, counts):
    return self._log_factorial(sum(counts)) - sum(self._log_factorial(c)
                                                    for c in counts)

  def _log_factorial(self, num):
    if not round(num)==num and num > 0:
      raise ValueError("Can only compute the factorial of positive ints")
    return sum(math.log(n) for n in range(1,num+1))

m = Multinomial([0.1, 0.1, 0.8])
print m.pmf([4,4,2])

>>2.016e-05

我对多项式系数的实现有些天真,并且在对数空间中工作以防止溢出.还应注意,n是多余的参数,因为它由计数的总和给出(并且相同的参数集适用于任何n).此外,由于中等n或较大维数会迅速下溢,因此最好在日志空间中工作(这里也提供了logPMF!)

My implementation of the multinomial coefficient is somewhat naive, and works in log space to prevent overflow. Also be aware that n is superfluous as a parameter, since it's given by the sum of the counts (and the same parameter set works for any n). Furthermore, since this will quickly underflow for moderate n or large dimensionality, you're better working in log space (logPMF provided here too!)

这篇关于python scipy/numpy中的多项式pmf的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆