Generating numbers (distribution) for a given kurtosis or skewness

Question

I am new to using statistical functions in Excel. Given a set of numbers, I am able to use the KURT and SKEW functions in Excel to calculate the kurtosis or skewness.

But my requirement is to go the other way: for a given skewness or kurtosis, is there a way to generate random numbers? Any pointers on how to do that would be appreciated.

The function should take the skewness or kurtosis value as input and generate 50 random numbers, with 1 as the minimum and 100,000 as the maximum.

If Excel does not have a way to do this, I am looking for suggestions in Python.

Can you please help me do this in Excel or Python?

Recommended answer

After experimenting with several distributions, I found that the generalised Gamma distribution seems flexible enough to set either the skew or the kurtosis to a desired value, but not both at the same time, which is what the question @gabriel mentioned in his comment asked for.

So to draw a sample from a g-Gamma distribution with a single fixed moment, you can use scipy.optimize to find the distribution that minimizes a penalty function (I chose (target - value) ** 2):

from scipy import stats, optimize
import numpy as np

def random_by_moment(moment, value, size):
    """ Draw `size` samples out of a generalised Gamma distribution
    where a given moment has a given value """
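    # a tuple check rejects '' and multi-letter strings such as 'mv',
    # which the substring test `moment in 'mvsk'` would let through
    if moment not in ('m', 'v', 's', 'k'):
        raise ValueError("'{}' invalid moment. Use 'm' for mean, "
                         "'v' for variance, 's' for skew and 'k' for kurtosis".format(moment))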
    def gengamma_error(a):
        m, v, s, k = (stats.gengamma.stats(a[0], a[1], moments="mvsk"))
        moments = {'m': m, 'v': v, 's': s, 'k': k}
        return (moments[moment] - value) ** 2    # has its minimum at the desired value      

    a, c = optimize.minimize(gengamma_error, (1, 1)).x    
    return stats.gengamma.rvs(a, c, size=size)

n = random_by_moment('k', 3, 100000)
# test if result is correct
print("mean={}, var={}, skew={}, kurt={}".format(np.mean(n), np.var(n), stats.skew(n), stats.kurtosis(n)))
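
One thing to keep in mind when choosing the target value: as far as I can tell, stats.gengamma.stats and stats.kurtosis both report excess (Fisher) kurtosis, where a normal distribution scores 0, and Excel's KURT is also an excess kurtosis, with a small-sample bias correction. The sketch below checks that stats.kurtosis(x, bias=False) reproduces that corrected value. The helper name excel_kurt is made up for illustration, its formula is my reading of the one documented for KURT, and the gamma sample is arbitrary test data.

```python
import numpy as np
from scipy import stats

def excel_kurt(x):
    """Hypothetical helper: sample excess kurtosis, following the formula
    documented for Excel's KURT (assumption: transcribed correctly)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s = x.std(ddof=1)                        # sample standard deviation
    z4 = np.sum(((x - x.mean()) / s) ** 4)   # sum of fourth powers of z-scores
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

rng = np.random.default_rng(0)
x = rng.gamma(2.0, size=50)                  # arbitrary skewed test data

print(excel_kurt(x))
print(stats.kurtosis(x, bias=False))         # bias-corrected excess kurtosis
print(stats.kurtosis(x))                     # default (biased) estimator
```

So a target of 3 passed as 'k' means an excess kurtosis of 3, not the kurtosis of a normal distribution.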

Before that, I came up with a function that matches both skew and kurtosis. However, even the g-Gamma is not flexible enough for this purpose, depending on how extreme your target values are:

def random_by_sk(skew, kurt, size):
    def gengamma_error(a):
        s, k = (stats.gengamma.stats(a[0], a[1], moments="sk"))
        return (s - skew) ** 2 + (k - kurt) ** 2  # penalty equally weighted for skew and kurtosis

    a, c = optimize.minimize(gengamma_error, (1, 1)).x    
    return stats.gengamma.rvs(a, c, size=size)

n = random_by_sk(3, 3, 100000)
print("mean={}, var={}, skew={}, kurt={}".format(np.mean(n), np.var(n), stats.skew(n), stats.kurtosis(n)))
# will yield skew ~2 and kurtosis ~3 instead of 3, 3
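
The question also asks for 50 values between 1 and 100,000, which neither function above handles. Skewness and kurtosis do not change under a shift combined with a positive rescale, so one option is to min-max scale the sample afterwards. A minimal sketch, where rescale_to_range is a name introduced here for illustration and a plain gamma sample stands in for the output of the functions above:

```python
import numpy as np
from scipy import stats

def rescale_to_range(x, low=1.0, high=100_000.0):
    """Hypothetical helper: linearly map x so that min(x) -> low and max(x) -> high."""
    x = np.asarray(x, dtype=float)
    return low + (x - x.min()) * (high - low) / (x.max() - x.min())

rng = np.random.default_rng(42)
sample = rng.gamma(2.0, size=50)     # stand-in for random_by_moment(...) output
scaled = rescale_to_range(sample)

print(scaled.min(), scaled.max())
# skew and kurtosis are unchanged by the linear rescale
print(stats.skew(sample), stats.skew(scaled))
print(stats.kurtosis(sample), stats.kurtosis(scaled))
```

With only 50 draws, the sample skew and kurtosis can land far from the distribution's theoretical values, so it may be worth redrawing until the sample statistics are close enough to the target.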
