Compute EWMA over sparse/irregular TimeSeries in Pandas

Question

Given the following high-frequency but sparse time series:

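import pandas as pd
from datetime import datetime

# Sparse time series: two 10 ms bursts of observations, 10 seconds apart
dti1 = pd.date_range(start=datetime(2015, 8, 1, 9, 0, 0), periods=10, freq='ms')
dti2 = pd.date_range(start=datetime(2015, 8, 1, 9, 0, 10), periods=10, freq='ms')
# Concatenate the two bursts (dti1 + dti2 raises a TypeError in current pandas)
dti = dti1.append(dti2)

ts = pd.Series(index=dti, data=range(20))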

I can compute an exponentially weighted moving average with a halflife of 5ms using a pandas function as follows:

# pd.ewma is the legacy top-level function (deprecated in pandas 0.18, removed in 0.23)
ema = pd.ewma(ts, halflife=5, freq='ms')

However, under the hood, the function is resampling my timeseries with an interval of 1 ms (which is the 'freq' that I supplied). This causes thousands of additional datapoints to be included in the output.

In [118]: len(ts)
Out[118]: 20
In [119]: len(ema)
Out[119]: 10010
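
The blow-up can be reproduced with the current `Series.ewm` API by making the resampling step explicit. The sketch below assumes that `pd.ewma(..., freq='ms')` amounts to a mean-resample onto a 1 ms grid followed by the EWMA; that equivalence is an assumption based on the behaviour described above, and the names `regular` and `ema_resampled` are illustrative only.

```python
import pandas as pd
from datetime import datetime

# Rebuild the sparse series: two 10 ms bursts, 10 seconds apart
dti1 = pd.date_range(start=datetime(2015, 8, 1, 9, 0, 0), periods=10, freq='ms')
dti2 = pd.date_range(start=datetime(2015, 8, 1, 9, 0, 10), periods=10, freq='ms')
ts = pd.Series(index=dti1.append(dti2), data=range(20))

# Resample onto a regular 1 ms grid: every empty millisecond becomes a NaN row
regular = ts.resample('ms').mean()

# halflife=5 is measured in rows here, which is 5 ms on this grid
ema_resampled = regular.ewm(halflife=5).mean()

n_obs, n_grid = len(ts), len(regular)
print(n_obs, n_grid, len(ema_resampled))
```

The grid spans 09:00:00.000 through 09:00:10.009, so the 10-second gap alone accounts for almost all of the rows.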

This is not scalable, as my real Timeseries contains hundreds of thousands of high-frequency observations that are minutes or hours apart.

Is there a Pandas/numpy way of computing an EMA for a sparse timeseries without resampling? Something similar to this: http://oroboro.com/irregular-ema/

Or do I have to write my own? Thanks!

Recommended answer

You can use reindex to align the ewma result with your original series.

pd.ewma(ts, halflife=5, freq='ms').reindex(ts.index)

2015-08-01 09:00:00.000     0.0000
2015-08-01 09:00:00.001     0.5346
2015-08-01 09:00:00.002     1.0921
2015-08-01 09:00:00.003     1.6724
2015-08-01 09:00:00.004     2.2750
2015-08-01 09:00:00.005     2.8996
2015-08-01 09:00:00.006     3.5458
2015-08-01 09:00:00.007     4.2131
2015-08-01 09:00:00.008     4.9008
2015-08-01 09:00:00.009     5.6083
2015-08-01 09:00:10.000    10.0000
2015-08-01 09:00:10.001    10.5346
2015-08-01 09:00:10.002    11.0921
2015-08-01 09:00:10.003    11.6724
2015-08-01 09:00:10.004    12.2750
2015-08-01 09:00:10.005    12.8996
2015-08-01 09:00:10.006    13.5458
2015-08-01 09:00:10.007    14.2131
2015-08-01 09:00:10.008    14.9008
2015-08-01 09:00:10.009    15.6083
dtype: float64
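
On pandas 1.1 or later, the resample-then-reindex round trip may be avoidable altogether: `Series.ewm` accepts a `times` argument together with a time-based `halflife`, so each observation is weighted by its actual distance in time from the current one. The sketch below is an alternative to the answer above, not part of it. It rebuilds the series from the question, and the comparison route assumes the legacy `freq='ms'` behaviour is a mean-resample followed by the EWMA.

```python
import numpy as np
import pandas as pd
from datetime import datetime

# Rebuild the sparse series: two 10 ms bursts, 10 seconds apart
dti1 = pd.date_range(start=datetime(2015, 8, 1, 9, 0, 0), periods=10, freq='ms')
dti2 = pd.date_range(start=datetime(2015, 8, 1, 9, 0, 10), periods=10, freq='ms')
ts = pd.Series(index=dti1.append(dti2), data=range(20))

# Weight each past observation by 0.5 ** (elapsed time / 5 ms); no resampling,
# so the result has exactly one row per original observation
ema_sparse = ts.ewm(halflife='5ms', times=ts.index.to_numpy()).mean()

# Resample-and-reindex route, for comparison only (assumed equivalent of the
# legacy pd.ewma(ts, halflife=5, freq='ms').reindex(ts.index))
ema_reindexed = ts.resample('ms').mean().ewm(halflife=5).mean().reindex(ts.index)

print(len(ema_sparse))
print(np.allclose(ema_sparse, ema_reindexed))
```

On this data the two routes should agree with the table above (for example 0.5346 at 09:00:00.001 and 10.0 at 09:00:10.000), while the `times` version never materialises the intermediate 1 ms grid. Note that `times` applies only to `mean()` and requires a monotonically increasing datetime index.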
