How does pandas calculate skew


Question

I'm calculating a coskew matrix and wanted to double-check my calculation against pandas' built-in skew method. I could not reconcile my result with how pandas performs the calculation.

I defined my series as:

import pandas as pd

series = pd.Series(
    {0: -0.051917457635120283,
     1: -0.070071606515280632,
     2: -0.11204865874074735,
     3: -0.14679988245503134,
     4: -0.088062467095565145,
     5: 0.17579741198527793,
     6: -0.10765856028420773,
     7: -0.11971470229167547,
     8: -0.15169210769159247,
     9: -0.038616800990881606,
     10: 0.16988162977411481,
     11: 0.092999418364443032}
)

I compared the following calculations and expected them to be the same.

series.skew()

1.1119637586658944

(((series - series.mean()) / series.std(ddof=0)) ** 3).mean()

0.967840223081231

Me, take 2

These are clearly different. I thought pandas might be using the Fisher-Pearson coefficient, so I tried:

n = len(series)
skew = series.sub(series.mean()).div(series.std(ddof=0)).apply(lambda x: x ** 3).mean()
skew * (n * (n - 1)) ** 0.5 / (n - 1)

1.0108761442417222

Still off.

How does pandas calculate skew?

Recommended answer

I found that scipy.stats.skew with the parameter bias=False returns the same output, so I think pandas skew behaves like bias=False by default:

bias : bool

If False, then the calculations are corrected for statistical bias.

import pandas as pd
from scipy import stats

series = pd.Series(
    {0: -0.051917457635120283,
     1: -0.070071606515280632,
     2: -0.11204865874074735,
     3: -0.14679988245503134,
     4: -0.088062467095565145,
     5: 0.17579741198527793,
     6: -0.10765856028420773,
     7: -0.11971470229167547,
     8: -0.15169210769159247,
     9: -0.038616800990881606,
     10: 0.16988162977411481,
     11: 0.092999418364443032}
)

print(series.skew())
1.11196375867

print(stats.skew(series, bias=False))
1.1119637586658944
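To make the bias distinction concrete, here is a minimal sketch (assuming a reasonably recent pandas and SciPy) that compares both settings of bias with the hand-rolled moment formula from the question. The variable names are illustrative only. The values are passed as a plain list because the integer index does not affect the skew.

```python
import pandas as pd
from scipy import stats

series = pd.Series([
    -0.051917457635120283, -0.070071606515280632, -0.11204865874074735,
    -0.14679988245503134, -0.088062467095565145, 0.17579741198527793,
    -0.10765856028420773, -0.11971470229167547, -0.15169210769159247,
    -0.038616800990881606, 0.16988162977411481, 0.092999418364443032,
])

# Hand-rolled population skewness from the question: mean of cubed z-scores
manual = (((series - series.mean()) / series.std(ddof=0)) ** 3).mean()

# scipy with bias=True (its default) should agree with the hand-rolled value
biased = stats.skew(series.to_numpy(), bias=True)

# scipy with bias=False should agree with pandas' Series.skew()
unbiased = stats.skew(series.to_numpy(), bias=False)

print(manual, biased)             # the two biased estimates
print(series.skew(), unbiased)    # the two bias-corrected estimates
```

If the two pairs agree, the gap seen in the question is explained by the bias correction alone rather than by a different definition of the moments.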

I'm not 100% sure, but I think I found it in the code.

EDIT (piRSquared)

From the scipy skew code:

if not bias:
    can_correct = (n > 2) & (m2 > 0)
    if can_correct.any():
        m2 = np.extract(can_correct, m2)
        m3 = np.extract(can_correct, m3)
        nval = ma.sqrt((n-1.0)*n)/(n-2.0)*m3/m2**1.5
        np.place(vals, can_correct, nval)
return vals

The adjustment is (n * (n - 1)) ** 0.5 / (n - 2), not (n * (n - 1)) ** 0.5 / (n - 1).
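Applying that corrected factor to the question's second attempt can be sketched as follows. This is only a check of the arithmetic using pandas alone, not a copy of pandas' internal implementation. The names g1 and G1 are labels chosen here for the biased and adjusted coefficients.

```python
import pandas as pd

series = pd.Series([
    -0.051917457635120283, -0.070071606515280632, -0.11204865874074735,
    -0.14679988245503134, -0.088062467095565145, 0.17579741198527793,
    -0.10765856028420773, -0.11971470229167547, -0.15169210769159247,
    -0.038616800990881606, 0.16988162977411481, 0.092999418364443032,
])

n = len(series)

# Biased skewness: mean of cubed z-scores, using the population std (ddof=0)
g1 = series.sub(series.mean()).div(series.std(ddof=0)).pow(3).mean()

# Adjusted coefficient: note the (n - 2) in the denominator, not (n - 1)
G1 = g1 * (n * (n - 1)) ** 0.5 / (n - 2)

print(G1)
print(series.skew())
```

With n = 12 the factor is sqrt(132) / 10, roughly 1.149, which scales the biased value 0.9678 up to about 1.112, the figure reported by series.skew() in the question.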
