Why is my Kurtosis function not producing the same output as scipy.stats.kurtosis?

Problem description

I have a homework problem in which I'm supposed to write a function for Kurtosis as described here:

The theta in the denominator is the standard deviation (square-root of the variance) and the x-with-the-bar in the numerator is the mean of x.
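
The formula itself (an image in the original post) did not survive on this page. Judging from the description above and from the corrected function given in the answer, it is presumably the standard population kurtosis shown below. The symbols n, x_i and sigma are assumptions chosen for this sketch; the question refers to the denominator symbol as "theta".

```latex
\mathrm{Kurt}(x) = \frac{\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^4}{\sigma^4},
\qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2,
\qquad
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
```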

I've implemented the function as follows:

import numpy as np
from scipy.stats import kurtosis

testdata = np.array([1, 2, 3, 4, 5])

def mean(obs):
    return (1. / len(obs)) * np.sum(obs)

def variance(obs):
    return (1. / len(obs)) * np.sum((obs - mean(obs)) ** 2)

def kurt(obs):
    num = np.sqrt((1. / len(obs)) * np.sum((obs - mean(obs)) ** 4))
    denom = variance(obs) ** 2  # avoid losing precision with np.sqrt call
    return num / denom

The first two functions, mean and variance, were successfully cross-validated with numpy.mean and numpy.var, respectively.

I attempted to cross-validate kurt with the following statement:

>>> kurtosis(testdata) == kurt(testdata)
False

Here's the output of both kurtosis functions:

>>> kurtosis(testdata)  # scipy.stats
-1.3

>>> kurt(testdata)  # my crappy attempt
0.65192024052026476

Where did I go wrong? Is scipy.stats.kurtosis doing something fancier than what's in the equation I've been given?

Recommended answer

By default, scipy.stats.kurtosis():

  1. Computes excess kurtosis (i.e. subtracts 3 from the result).
  2. Can correct for statistical bias (this affects some of the denominators), though only when bias=False is passed; the default, bias=True, leaves the estimate uncorrected.

Both behaviours are configurable through optional arguments to scipy.stats.kurtosis().
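
As a rough illustration of those optional arguments, the sketch below calls scipy.stats.kurtosis on the question's test data with the fisher and bias keywords spelled out. It assumes a reasonably recent SciPy, where the signature is kurtosis(a, axis=0, fisher=True, bias=True, ...); the variable names are made up for this example.

```python
import numpy as np
from scipy.stats import kurtosis

testdata = np.array([1, 2, 3, 4, 5])

# Defaults: fisher=True (subtract 3, "excess" kurtosis) and
# bias=True (no correction for statistical bias).
excess_biased = kurtosis(testdata)

# Pearson's definition: fisher=False keeps the raw ratio, nothing subtracted.
pearson_biased = kurtosis(testdata, fisher=False)

# bias=False asks SciPy to apply its bias correction to the estimate.
excess_unbiased = kurtosis(testdata, fisher=True, bias=False)

print(excess_biased, pearson_biased, excess_unbiased)
```

For this data the first two values come out at roughly -1.3 and 1.7, so they differ by exactly the 3 that fisher=True subtracts.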

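Finally, the np.sqrt() call in your method is unnecessary since there's no square root in the formula. Once I remove it, the output of your function matches what I get from kurtosis(testdata, False, False). Note that those two positional arguments are read as axis and fisher, so the call is equivalent to kurtosis(testdata, fisher=False) with bias left at its default.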

"I attempted to cross-validate kurt with the following statement"

You shouldn't be comparing floating-point numbers for exact equality. Even if the mathematical formulae are the same, small differences in how they are translated into computer code could affect the result of the computation.
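
One way to make such a cross-check tolerant of rounding is np.isclose. The sketch below is only an illustration: kurt here is a compact, self-contained restatement of the formula without the square root (not the asker's exact code), and the default tolerances of np.isclose are assumed to be acceptable for this comparison.

```python
import numpy as np
from scipy.stats import kurtosis

def kurt(obs):
    """Population (Pearson) kurtosis: fourth central moment over variance squared."""
    obs = np.asarray(obs, dtype=float)
    centered = obs - obs.mean()
    fourth_moment = np.mean(centered ** 4)
    variance = np.mean(centered ** 2)
    return fourth_moment / variance ** 2

testdata = np.array([1, 2, 3, 4, 5])

ours = kurt(testdata)
reference = kurtosis(testdata, fisher=False)  # same definition, no 3 subtracted

# Exact equality depends on rounding in each implementation, so it is not reliable.
print(ours == reference)

# Comparing within a tolerance is the more robust check.
print(np.isclose(ours, reference))
```

If a stricter or looser comparison is needed, np.isclose accepts rtol and atol arguments.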

Also, if you're going to be writing numerical code, I strongly recommend reading "What Every Computer Scientist Should Know About Floating-Point Arithmetic".

P.S. This is the function I've used:

def kurt(obs):
    num = np.sum((obs - mean(obs)) ** 4) / len(obs)
    denom = variance(obs) ** 2  # avoid losing precision with np.sqrt call
    return num / denom
