Issues with Python scipy rv_continuous implementation

Problem description

I'm trying to create a subclass of rv_continuous for a custom distribution whose pdf I can calculate through a number of functions.

Here's what I've done so far:

import numpy as np
from scipy.stats import rv_continuous

Helper functions

def func1(xx, a_, b_, rho, m, sigma):
    return a_ + b_*(rho*(xx-m) + np.sqrt((xx-m)*(xx-m) + sigma*sigma))

def func2(xx, a_, b_, rho, m, sigma):
    sig2 = sigma*sigma
    return b_*(rho*np.sqrt((xx-m)*(xx-m)+sig2)+xx-m)/(np.sqrt((xx-m)*(xx-m)+sig2))

def func3(xx, a_, b_, rho, m, sigma):
    sig2 = sigma*sigma
    return b_*sig2/(np.sqrt((xx-m)*(xx-m)+sig2)*((xx-m)*(xx-m)+sig2))

def func4(xx, a_, b_, rho, m, sigma):
    w = func1(xx, a_, b_, rho, m, sigma)
    w1 = func2(xx, a_, b_, rho, m, sigma)
    w2 = func3(xx, a_, b_, rho, m, sigma)
    return (1.-0.5*xx*w1/w)*(1.0-0.5*xx*w1/w) - 0.25*w1*w1*(0.25 + 1./w) + 0.5*w2

def func5(xx, a_, b_, rho, m, sigma):
    vsqrt = np.sqrt(func1(xx, a_, b_, rho, m, sigma))
    return -xx/vsqrt - 0.5*vsqrt

Finally, the density function

def density(xx, a_, b_, rho, m, sigma):
    dm = func5(xx, a_, b_, rho, m, sigma)
    return func4(xx, a_, b_, rho, m, sigma)*np.exp(-0.5*dm*dm)/np.sqrt(2.*np.pi*func1(xx, a_, b_, rho, m, sigma))

A set of parameters

Params = 1.0073, 0.3401026, -0.8, 0.000830, 0.5109564

Checking the pdf from the function

xmin, xmax, nbPoints = -10., 10., 2000
x_real = np.linspace(xmin, xmax, nbPoints)

den_from_func = density(x_real, *Params)

Now building my distribution class

class density_gen(rv_continuous):
    def _pdf(self, x, a_hat, b_hat, rho, m, sigma):
        return density(x, a_hat, b_hat, rho, m, sigma)

Instantiation

my_density = density_gen(name='density_gen')

my_density.a, my_density.b, my_density.numargs

Since I've specified _pdf, I should have a working distribution instance.

This works:

pdf = my_density._pdf(x_real, *Params)

cdf works too, although it's extremely slow:

cdf = my_density._cdf(x_real, *Params)
my_density._cdf(0.1, *Params)

But for all the other methods I get NaNs, for instance:

my_density.mean(*Params)
my_density.ppf(0.01, *Params)

What am I doing wrong here?

Recommended answer

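It appears you need to add an _argcheck method to density_gen. By default, rv_continuous._argcheck accepts only strictly positive shape parameters, and your rho = -0.8 fails that check, so public methods such as mean and ppf return nan. The private _pdf and _cdf skip the check, which is why they still work: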

class density_gen(rv_continuous):

    def _argcheck(self, *Params):
        # Accept any parameter values; the default check rejects rho < 0.
        return True

    def _pdf(self, x, a_hat, b_hat, rho, m, sigma):
        return density(x, a_hat, b_hat, rho, m, sigma)

my_density = density_gen(name='density_gen')
pdf = my_density._pdf(x_real, *Params)
print(my_density.rvs(*Params, size=5))
print(my_density.mean(*Params))
print(my_density.ppf(0.01, *Params))
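
To see the effect of _argcheck in isolation, here is a minimal sketch using a toy distribution rather than the density from the question. The class names and the single shape parameter mu are made up for illustration; the only thing that differs between the two classes is whether _argcheck is overridden.

```python
import numpy as np
from scipy.stats import rv_continuous


class shifted_normal_default(rv_continuous):
    """Toy distribution (hypothetical): unit-variance normal with mean `mu`.

    It keeps the default _argcheck, which requires every shape
    parameter to be strictly positive.
    """

    def _pdf(self, x, mu):
        return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)


class shifted_normal_checked(shifted_normal_default):
    """Same pdf, but any finite `mu` (including negative values) is accepted."""

    def _argcheck(self, mu):
        return np.isfinite(mu)


default_dist = shifted_normal_default(name="shifted_normal_default")
checked_dist = shifted_normal_checked(name="shifted_normal_checked")

# A negative shape parameter fails the default check, so cdf returns nan.
nan_result = default_dist.cdf(0.0, -0.8)

# With the custom check, the generic cdf integrates the pdf numerically.
ok_result = checked_dist.cdf(0.0, -0.8)

print(nan_result)  # nan
print(ok_result)   # about 0.788, the standard normal CDF at 0.8
```

A check that validates the parameters (here, finiteness) is usually safer than returning True unconditionally, because invalid inputs then still produce nan instead of meaningless numbers.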

However, rvs, mean, and so on will then be very slow, presumably because the generic implementation has to integrate the PDF numerically every time it generates a random number or calculates a statistic. If speed is at a premium, you will need to add to density_gen an _rvs method that uses its own sampler. One example is my own DensityInversionSampler, which generates random numbers by numerical inversion when given only the PDF and the sampling domain.
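
As a rough illustration of a custom _rvs, the sketch below tabulates the CDF once on a grid and inverts it by linear interpolation. This is not DensityInversionSampler, just a simple stand-in for the same idea. It assumes SciPy 1.5 or later (where _rvs receives size and random_state as keyword arguments), a scalar shape parameter, and that essentially all probability mass lies in [-10, 10]; the toy class and its parameter mu are hypothetical.

```python
import numpy as np
from scipy.stats import rv_continuous


class grid_sampled_normal(rv_continuous):
    """Toy distribution (hypothetical): unit-variance normal with mean `mu`."""

    def _argcheck(self, mu):
        return np.isfinite(mu)

    def _pdf(self, x, mu):
        return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2.0 * np.pi)

    def _rvs(self, mu, size=None, random_state=None):
        # Assumption: the mass outside [-10, 10] is negligible.
        grid = np.linspace(-10.0, 10.0, 2001)
        pdf_vals = self._pdf(grid, mu)

        # Cumulative trapezoidal integral of the pdf, starting at 0.
        steps = 0.5 * (pdf_vals[1:] + pdf_vals[:-1]) * np.diff(grid)
        cdf_vals = np.concatenate(([0.0], np.cumsum(steps)))
        cdf_vals /= cdf_vals[-1]  # renormalise so the table ends at 1

        # Inverse-transform sampling: map uniform draws through the
        # tabulated inverse CDF.
        u = random_state.uniform(size=size)
        return np.interp(u, cdf_vals, grid)


dist = grid_sampled_normal(name="grid_sampled_normal")
samples = dist.rvs(-0.8, size=20000, random_state=0)
print(samples.mean(), samples.std())
```

The table is rebuilt on every call here to keep the sketch short; caching it per parameter set would avoid repeated work. The accuracy of the samples depends on the grid range and spacing, so both would need tuning for a real density.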
