Pandas: compute hourly average and set it at the middle of the interval

Question

I want to compute the hourly mean for a time series of wind speed and direction, but I want the timestamp set at the half hour. So the average of the values from 14:00 to 15:00 should be stamped 14:30. Right now, I can only seem to get it on the left or right edge of the interval. Here is what I currently have:

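import pandas

# dates_g and data_g are defined elsewhere (the raw timestamps and readings)
# Truncate each timestamp to the minute
ts_g = [item.replace(second=0, microsecond=0) for item in dates_g]
dg = {'ws': data_g.ws, 'wdir': data_g.wdir}
df_g = pandas.DataFrame(data=dg, index=ts_g, columns=['ws', 'wdir'])
# Group into hourly bins (pandas.TimeGrouper('H') in older pandas, since removed)
grouped_g = df_g.groupby(pandas.Grouper(freq='H'))
hourly_ws_g = grouped_g['ws'].mean()
hourly_wdir_g = grouped_g['wdir'].mean()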

The output looks like this:

2016-04-08 06:00:00+00:00     46.980000
2016-04-08 07:00:00+00:00     64.313333
2016-04-08 08:00:00+00:00     75.678333
2016-04-08 09:00:00+00:00    127.383333
2016-04-08 10:00:00+00:00    145.950000
2016-04-08 11:00:00+00:00    184.166667
....

But I would like it to look like this:

2016-04-08 06:30:00+00:00     54.556
2016-04-08 07:30:00+00:00     78.001
....

Thanks for your help!

Recommended answer

The easiest way is to resample and then use linear interpolation:

In [20]: import numpy as np; import pandas as pd

In [21]: rng = pd.date_range('1/1/2011', periods=72, freq='H')

In [22]: ts = pd.Series(np.random.randn(len(rng)), index=rng)

In [23]: ts.head()
Out[23]: 
2011-01-01 00:00:00    0.796704
2011-01-01 01:00:00   -1.153179
2011-01-01 02:00:00   -1.919475
2011-01-01 03:00:00    0.082413
2011-01-01 04:00:00   -0.397434
Freq: H, dtype: float64

In [24]: ts2 = ts.resample('30T').interpolate()

In [25]: ts2.head()
Out[25]: 
2011-01-01 00:00:00    0.796704
2011-01-01 00:30:00   -0.178237
2011-01-01 01:00:00   -1.153179
2011-01-01 01:30:00   -1.536327
2011-01-01 02:00:00   -1.919475
Freq: 30T, dtype: float64

I believe this is what you need.
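
The interpolated series still contains the on-the-hour stamps, while the desired output in the question lists only the half-hour ones. Below is a minimal sketch of one way to keep just those, assuming a small made-up hourly series in place of `ts`; the values and the name `half_hour` are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical hourly series standing in for the answer's `ts`.
rng = pd.date_range("2011-01-01", periods=5, freq="h")
ts = pd.Series(np.arange(5, dtype=float), index=rng)

# Upsample to 30 minutes and fill the new stamps by linear interpolation.
ts2 = ts.resample("30min").interpolate()

# Keep only the stamps that fall on the half hour.
half_hour = ts2[ts2.index.minute == 30]
print(half_hour)
```

The boolean mask on `index.minute` is only one option; slicing with `ts2.iloc[1::2]` would likely give the same result here, but it relies on the series starting exactly on the hour.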

Perhaps it's easier to see what's going on without random data:

In [29]: ts.head()
Out[29]: 
2011-01-01 00:00:00    0
2011-01-01 01:00:00    1
2011-01-01 02:00:00    2
2011-01-01 03:00:00    3
2011-01-01 04:00:00    4
Freq: H, dtype: int64

In [30]: ts2 = ts.resample('30T').interpolate()

In [31]: ts2.head()
Out[31]: 
2011-01-01 00:00:00    0.0
2011-01-01 00:30:00    0.5
2011-01-01 01:00:00    1.0
2011-01-01 01:30:00    1.5
2011-01-01 02:00:00    2.0
Freq: 30T, dtype: float64
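
Note that linear interpolation places the midpoint of two neighbouring hourly values at each half hour, which is not necessarily the mean of all readings inside that hour when the raw data are sub-hourly. An alternative sketch, not part of the original answer, is to take the hourly mean directly and then move each label from the left edge of its bin to the middle. The 10-minute readings and the column name `ws` below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical 10-minute wind-speed readings standing in for the question's data.
idx = pd.date_range("2016-04-08 06:00", periods=12, freq="10min", tz="UTC")
df = pd.DataFrame({"ws": np.arange(12, dtype=float)}, index=idx)

# Hourly mean; by default each bin is labelled with its left edge (06:00, 07:00).
hourly = df["ws"].resample("h").mean()

# Move each label from the left edge to the middle of the hour.
hourly.index = hourly.index + pd.Timedelta(minutes=30)
print(hourly)
```

With this made-up input, the 06:00 to 07:00 readings average to 2.5 and end up stamped 06:30, and the 07:00 to 08:00 readings average to 8.5 and end up stamped 07:30.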
