Computing an EWMA of a DataFrame by time

Question

I have this dataframe:

    avg                date    high  low      qty
0 16.92 2013-05-27 00:00:00   19.00 1.22 71151.00
1 14.84 2013-05-30 00:00:00   19.00 1.22 42939.00
2  9.19 2013-06-02 00:00:00   17.20 1.23  5607.00
3 23.63 2013-06-05 00:00:00 5000.00 1.22  5850.00
4 13.82 2013-06-10 00:00:00   19.36 1.22  5644.00
5 17.76 2013-06-15 00:00:00   24.00 2.02 16969.00

Each row is an observation of avg, high, low, and qty recorded on the specified date.

I'm trying to compute an exponentially weighted moving average (EWMA) with a span of 60 days:

df["emwa"] = pandas.ewma(df["avg"],span=60,freq="D")

but I get

TypeError: Only valid with DatetimeIndex or PeriodIndex

Okay, so maybe I need to add a DatetimeIndex to my DataFrame when it's constructed. Let me change my constructor call from

df = pandas.DataFrame(records)  # records is just a list of dictionaries

to

rng = pandas.date_range(firstDate, lastDate, freq='D')
df = pandas.DataFrame(records, index=rng)

but now I get

ValueError: Shape of passed values is (5,), indices imply (5, 1641601)

Any suggestions for how to compute my EWMA?

Recommended answer

You need to do two things: make sure the date column holds dates (rather than strings), and set the index to those dates.
You can do both in one go using to_datetime:

In [11]: df.index = pd.to_datetime(df.pop('date'))

In [12]: df
Out[12]:
              avg     high   low    qty
date
2013-05-27  16.92    19.00  1.22  71151
2013-05-30  14.84    19.00  1.22  42939
2013-06-02   9.19    17.20  1.23   5607
2013-06-05  23.63  5000.00  1.22   5850
2013-06-10  13.82    19.36  1.22   5644
2013-06-15  17.76    24.00  2.02  16969
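
For a self-contained version of this step, here is a minimal sketch. The `records` list is an assumption, reconstructed from the rows printed in the question (the original list of dictionaries is not shown), and the dates are assumed to arrive as strings.

```python
import pandas as pd

# Hypothetical records mirroring the rows printed in the question.
records = [
    {"avg": 16.92, "date": "2013-05-27 00:00:00", "high": 19.00, "low": 1.22, "qty": 71151.0},
    {"avg": 14.84, "date": "2013-05-30 00:00:00", "high": 19.00, "low": 1.22, "qty": 42939.0},
    {"avg": 9.19, "date": "2013-06-02 00:00:00", "high": 17.20, "low": 1.23, "qty": 5607.0},
    {"avg": 23.63, "date": "2013-06-05 00:00:00", "high": 5000.00, "low": 1.22, "qty": 5850.0},
    {"avg": 13.82, "date": "2013-06-10 00:00:00", "high": 19.36, "low": 1.22, "qty": 5644.0},
    {"avg": 17.76, "date": "2013-06-15 00:00:00", "high": 24.00, "low": 2.02, "qty": 16969.0},
]

df = pd.DataFrame(records)

# pop() removes the "date" column and returns it; to_datetime() parses the
# strings, and assigning the result to df.index yields a DatetimeIndex.
df.index = pd.to_datetime(df.pop("date"))
```

The index is built from the data itself, so it always has one entry per row. That is the difference from passing a separately generated date_range as the index, which only works if its length happens to match the number of records.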

Then you can call ewma as expected:

In [13]: pd.ewma(df["avg"], span=60, freq="D")
Out[13]:
date
2013-05-27    16.920000
2013-05-28    16.920000
2013-05-29    16.920000
2013-05-30    15.862667
2013-05-31    15.862667
2013-06-01    15.862667
2013-06-02    13.563899
2013-06-03    13.563899
2013-06-04    13.563899
2013-06-05    16.207625
2013-06-06    16.207625
2013-06-07    16.207625
2013-06-08    16.207625
2013-06-09    16.207625
2013-06-10    15.697743
2013-06-11    15.697743
2013-06-12    15.697743
2013-06-13    15.697743
2013-06-14    15.697743
2013-06-15    16.070721
Freq: D, dtype: float64
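
The top-level pd.ewma function used above was later deprecated and removed from pandas in favor of the .ewm() method, and .ewm() has no freq argument. The following is a rough sketch of an equivalent for recent pandas versions, not the answer author's code: it resamples to daily frequency first and then applies .ewm(). The choice of ignore_na=True is an assumption; it is used here because it appears to reproduce the numbers printed above.

```python
import pandas as pd

# The "avg" column from the question, indexed by observation date.
avg = pd.Series(
    [16.92, 14.84, 9.19, 23.63, 13.82, 17.76],
    index=pd.to_datetime(
        ["2013-05-27", "2013-05-30", "2013-06-02",
         "2013-06-05", "2013-06-10", "2013-06-15"]
    ),
    name="avg",
)

# Upsample to one row per calendar day; days without an observation are NaN.
daily = avg.resample("D").mean()

# ignore_na=True (an assumption, see above) computes the weights from the
# order of the non-missing observations only. On days with no observation
# the previous average is carried forward.
ewma_daily = daily.ewm(span=60, ignore_na=True).mean()
print(ewma_daily)
```

With the default ignore_na=False, the missing days would also decay the weights of older observations, so the values would differ from those shown above.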

And if you set this as a column:

In [14]: df['ewma'] = pd.ewma(df["avg"], span=60, freq="D")

In [15]: df
Out[15]:
              avg     high   low    qty       ewma
date
2013-05-27  16.92    19.00  1.22  71151  16.920000
2013-05-30  14.84    19.00  1.22  42939  15.862667
2013-06-02   9.19    17.20  1.23   5607  13.563899
2013-06-05  23.63  5000.00  1.22   5850  16.207625
2013-06-10  13.82    19.36  1.22   5644  15.697743
2013-06-15  17.76    24.00  2.02  16969  16.070721
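
Assigning the daily series to a column aligns it on the index, which is why only the six observation dates remain. As a hedged sketch for recent pandas versions (again, not the answer author's code), the same column appears to be obtainable by calling .ewm() directly on the six observations. The second column is an optional time-weighted variant using the times argument; the 30-day halflife is an arbitrary illustrative value and is not equivalent to span=60.

```python
import pandas as pd

df = pd.DataFrame(
    {"avg": [16.92, 14.84, 9.19, 23.63, 13.82, 17.76]},
    index=pd.to_datetime(
        ["2013-05-27", "2013-05-30", "2013-06-02",
         "2013-06-05", "2013-06-10", "2013-06-15"]
    ),
)

# Observation-weighted: each row counts as one step, whatever the gap in days.
df["ewma"] = df["avg"].ewm(span=60).mean()

# Time-weighted alternative: weights decay with the elapsed time between rows.
# halflife="30 days" is a hypothetical choice made only for illustration.
df["ewma_by_time"] = df["avg"].ewm(halflife="30 days", times=df.index).mean()

print(df)
```

The first column ignores how far apart the observations are, while the second gives less weight to older rows when the gap between dates is larger, which may be closer to what "by time" is meant to express.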
