Pandas dataframe groupby to calculate population standard deviation

Views: 279 Published: 2021/6/13 20:34:57 Tags: python numpy pandas statistics

Question

I am trying to use groupby and np.std to calculate a standard deviation, but it seems to be calculating the sample standard deviation (delta degrees of freedom equal to 1, i.e. ddof=1).

Here is an example.

#create dataframe
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'A':[1,1,2,2],'B':[1,2,1,2],'values':np.arange(10,30,5)})
>>> df
   A  B  values
0  1  1      10
1  1  2      15
2  2  1      20
3  2  2      25

#calculate standard deviation using groupby
>>> df.groupby('A').agg(np.std)
      B    values
A                    
1  0.707107  3.535534
2  0.707107  3.535534

#Calculate using numpy (np.std)
>>> np.std([10,15],ddof=0)
2.5
>>> np.std([10,15],ddof=1)
3.5355339059327378

Is there a way to use the population standard deviation calculation (ddof=0) with the groupby statement? The records I am working with (not the example table above) are not samples, so I am only interested in population standard deviations.

Recommended answer

You can pass additional arguments through to np.std in the agg function:

In [202]:

df.groupby('A').agg(np.std, ddof=0)

Out[202]:
     B  values
A             
1  0.5     2.5
2  0.5     2.5

In [203]:

df.groupby('A').agg(np.std, ddof=1)

Out[203]:
          B    values
A                    
1  0.707107  3.535534
2  0.707107  3.535534
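
The same population figures can probably be obtained without routing np.std through agg. The sketch below is a minimal illustration, assuming a reasonably recent pandas in which the groupby std method accepts ddof; the variable names pop_std and pop_std_lambda are made up for this example and do not come from the original answer.

```python
import numpy as np
import pandas as pd

# Same example table as in the question.
df = pd.DataFrame({
    "A": [1, 1, 2, 2],
    "B": [1, 2, 1, 2],
    "values": np.arange(10, 30, 5),
})

# Option 1: the groupby std method takes ddof directly.
pop_std = df.groupby("A").std(ddof=0)

# Option 2: an explicit lambda, so NumPy's np.std is called
# on each group's column with ddof=0 spelled out.
pop_std_lambda = df.groupby("A").agg(lambda s: np.std(s.to_numpy(), ddof=0))

print(pop_std)
print(pop_std_lambda)
```

Both variants divide by n rather than n - 1 within each group, so for the group [10, 15] they should agree with np.std([10,15],ddof=0) shown in the question.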
