Calculate numpy.std of each pandas.DataFrame's column?

Question

I want to get the numpy.std of each column of my pandas.DataFrame.

Here is my code:

import pandas as pd
import numpy as np

prices = pd.DataFrame([[-0.33333333, -0.25343423, -0.1666666667],
                       [+0.23432323, +0.14285714, -0.0769230769],
                       [+0.42857143, +0.07692308, +0.1818181818]])

print(pd.DataFrame(prices.std(axis=0)))

Here is the output of my code:

pd.DataFrame([[ 0.39590933],
              [ 0.21234018],
              [ 0.1809432 ]])

And here is the output I expected (as calculated with np.std):

pd.DataFrame([[ 0.32325862],
              [ 0.17337503],
              [ 0.1477395 ]])

Why am I getting this difference? How can I fix it?

NOTE: I tried doing it this way:

print(np.std(prices, axis=0))

But I got the following error:

Traceback (most recent call last):
  File "C:\Users\*****\Documents\******\******\****.py", line 10, in <module>
    print(np.std(prices, axis=0))
  File "C:\Python33\lib\site-packages\numpy\core\fromnumeric.py", line 2812, in std
    return std(axis=axis, dtype=dtype, out=out, ddof=ddof)
TypeError: std() got an unexpected keyword argument 'dtype'

Thanks!

Recommended answer

They're both right: they just differ in their default delta degrees of freedom (ddof), which sets the divisor N - ddof used when averaging the squared deviations. np.std uses 0, and DataFrame.std uses 1:

>>> prices.std(axis=0, ddof=0)
0    0.323259
1    0.173375
2    0.147740
dtype: float64
>>> prices.std(axis=0, ddof=1)
0    0.395909
1    0.212340
2    0.180943
dtype: float64
>>> np.std(prices.values, axis=0, ddof=0)
array([ 0.32325862,  0.17337503,  0.1477395 ])
>>> np.std(prices.values, axis=0, ddof=1)
array([ 0.39590933,  0.21234018,  0.1809432 ])
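
To make the role of ddof concrete, here is a minimal sketch that recomputes both results by hand. The helper name column_std is hypothetical and used only for illustration; it assumes the same prices frame as in the question and relies on nothing beyond the divisor N - ddof described above.

```python
import numpy as np
import pandas as pd

prices = pd.DataFrame([[-0.33333333, -0.25343423, -0.1666666667],
                       [+0.23432323, +0.14285714, -0.0769230769],
                       [+0.42857143, +0.07692308, +0.1818181818]])


def column_std(frame, ddof):
    """Hypothetical helper: per-column standard deviation with divisor N - ddof."""
    values = frame.to_numpy()
    n = values.shape[0]  # number of rows (observations) per column
    squared_dev = (values - values.mean(axis=0)) ** 2
    return np.sqrt(squared_dev.sum(axis=0) / (n - ddof))


population = column_std(prices, ddof=0)  # same divisor as np.std's default
sample = column_std(prices, ddof=1)      # same divisor as DataFrame.std's default

# With N = 3 rows, the two results differ by a constant factor of sqrt(3 / 2).
ratio = sample / population

print(population)
print(sample)
print(ratio)
```

So the fix on the pandas side is simply to pass ddof=0 explicitly, as in prices.std(axis=0, ddof=0), whenever the np.std convention is the one you want.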
