Aggregate time series in Python


Question

How do I aggregate a time series at hourly or minute granularity? Given a time series like the following, I want the values summed per hour. Does pandas support this, or is there another neat way to do it in Python?

timestamp, value
2012-04-30T22:25:31+00:00, 1
2012-04-30T22:25:43+00:00, 1
2012-04-30T22:29:04+00:00, 2
2012-04-30T22:35:09+00:00, 4
2012-04-30T22:39:28+00:00, 1
2012-04-30T22:47:54+00:00, 8
2012-04-30T22:50:49+00:00, 9
2012-04-30T22:51:57+00:00, 1
2012-04-30T22:54:50+00:00, 1
2012-04-30T22:57:22+00:00, 0
2012-04-30T22:58:38+00:00, 7
2012-04-30T23:05:21+00:00, 1
2012-04-30T23:08:56+00:00, 1

I also checked that my data frame has the correct data types by calling:

  print(data_frame.dtypes)

I get the following result:

ts     datetime64[ns]
val             int64

When I call groupby on the data frame:

grouped = data_frame.groupby(lambda x: x.minute)

I get the following error:

grouped = data_frame.groupby(lambda x: x.minute)
AttributeError: 'int' object has no attribute 'minute'

Recommended answer

Use the DataFrame.resample method: http://pandas.pydata.org/pandas-docs/dev/generated/pandas.DataFrame.resample.html. It lets you choose how each time bin is aggregated; in your case, that is a sum.

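# Index by the datetime column "ts", then sum each 1-minute bin (recent pandas no longer accepts how=)
data_frame.set_index("ts").resample("1min").sum()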
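For the hourly case asked about in the question, here is a minimal sketch built on the sample data. The column names `ts` and `val` are taken from the dtypes output above. The CSV loading and the use of `pd.offsets.Hour()` as the bin size are assumptions made for illustration, not part of the original answer.

```python
import io

import pandas as pd

# Sample data from the question, with the columns named as in the dtypes output.
raw = """ts,val
2012-04-30T22:25:31+00:00,1
2012-04-30T22:25:43+00:00,1
2012-04-30T22:29:04+00:00,2
2012-04-30T22:35:09+00:00,4
2012-04-30T22:39:28+00:00,1
2012-04-30T22:47:54+00:00,8
2012-04-30T22:50:49+00:00,9
2012-04-30T22:51:57+00:00,1
2012-04-30T22:54:50+00:00,1
2012-04-30T22:57:22+00:00,0
2012-04-30T22:58:38+00:00,7
2012-04-30T23:05:21+00:00,1
2012-04-30T23:08:56+00:00,1
"""

data_frame = pd.read_csv(io.StringIO(raw))
# Parse the ISO 8601 strings into timezone-aware (UTC) timestamps.
data_frame["ts"] = pd.to_datetime(data_frame["ts"], utc=True)

# resample works on a DatetimeIndex, so move "ts" into the index first,
# then sum the values that fall into each one-hour bin.
hourly = data_frame.set_index("ts").resample(pd.offsets.Hour()).sum()
print(hourly)
```

Passing an offset object avoids depending on the spelling of the hourly string alias, which has changed between pandas versions.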

http://pandas.pydata.org/pandas-docs/dev/timeseries.html#up-and-downsampling
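As for the AttributeError in the question: when groupby is given a function, pandas calls it on each index label. The data frame there still has its default integer index, so the lambda receives an int rather than a timestamp. The sketch below shows two ways around this, assuming the same `ts` and `val` columns; the four-row frame is made up for illustration.

```python
import pandas as pd

# A small hypothetical frame with the same column names as in the question.
data_frame = pd.DataFrame({
    "ts": pd.to_datetime([
        "2012-04-30 22:25:31",
        "2012-04-30 22:25:43",
        "2012-04-30 22:29:04",
        "2012-04-30 23:05:21",
    ]),
    "val": [1, 1, 2, 1],
})

# Option 1: group directly on the "ts" column in 1-minute bins.
# Empty bins between the first and last timestamp appear with a sum of 0.
per_minute = data_frame.groupby(pd.Grouper(key="ts", freq="1min"))["val"].sum()

# Option 2: make "ts" the index, so the function passed to groupby
# receives Timestamp objects instead of integers.
by_hour = data_frame.set_index("ts").groupby(lambda x: x.hour)["val"].sum()

print(per_minute[per_minute > 0])
print(by_hour)
```

Note that grouping by `x.hour` (or `x.minute`) merges the same hour or minute from different days into one group, whereas resample and `pd.Grouper` keep each calendar bin separate.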
