Split a series on time gaps in pandas?
Question
Is it possible to split a time series on its gaps? For example, suppose we had the following:
rng2011 = pd.date_range('1/1/2011', periods=72, freq='H')
rng2012 = pd.date_range('1/1/2012', periods=72, freq='H')
Y = rng2011.union(rng2012)
Is it possible to look for gaps of a year or more, and split the data frame on them?
I imagine it would look something like this:
Y.groupby(Y.map(lambda x: x.year))
Except that this splits on the calendar year, and I'm interested in specifying a gap length rather than using the year attribute of each row.
The application is that I have trip logs from a GPS, but no delineation of when one trip ended and another began. I'd like to split on gaps of ten minutes or longer.
Recommended answer
Assuming Y is a column in your DataFrame, one way is to use diff and cumsum:
df = pd.DataFrame(Y)
# True wherever the step from the previous row exceeds ten minutes
df[1] = df[0].diff() > pd.Timedelta(minutes=10)
# The running count of gaps gives every segment its own integer label
df[1] = df[1].cumsum()
df.groupby(1)
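Note: with a threshold of 72 hours instead of ten minutes, the example data splits into two groups, because the only step longer than that is the jump from 2011 to 2012. With the ten-minute threshold, every hourly row in the example ends up in its own group.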
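For reference, here is a self-contained sketch of the diff-and-cumsum approach applied to the example data from the question. The column names `time` and `segment` and the helper `label_segments` are made up for illustration and are not part of pandas; the hourly frequency is passed as a `Timedelta` only to avoid depending on a particular frequency alias.

```python
import pandas as pd

# Rebuild the example index from the question: 72 hourly stamps in each year.
one_hour = pd.Timedelta(hours=1)
rng2011 = pd.date_range("2011-01-01", periods=72, freq=one_hour)
rng2012 = pd.date_range("2012-01-01", periods=72, freq=one_hour)
Y = rng2011.union(rng2012)

df = pd.DataFrame({"time": Y})


def label_segments(times, max_gap):
    """Hypothetical helper: label each row with an integer segment id.

    A new segment starts whenever the step from the previous timestamp
    is strictly greater than max_gap.
    """
    step = times.diff()                # first entry is NaT
    starts_new_segment = step > max_gap  # NaT compares as False
    return starts_new_segment.cumsum()


# A 72-hour threshold: only the jump from 2011 to 2012 counts as a gap.
df["segment"] = label_segments(df["time"], pd.Timedelta(hours=72))

# Materialize the pieces as separate DataFrames.
segments = [group for _, group in df.groupby("segment")]
print(len(segments))

# Summarize where each segment starts and ends.
print(df.groupby("segment")["time"].agg(["min", "max", "count"]))
```

For the GPS trip logs, the same helper should work with `pd.Timedelta(minutes=10)` as `max_gap`, provided the timestamps are sorted before calling `diff`.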