ValueError: index must be monotonic when applying rolling("2H").mean()
Problem description
I have the following DataFrame df:
TIME DELAY
0 2016-01-01 06:30:00 0
1 2016-01-01 14:10:00 2
2 2016-01-01 07:05:00 2
3 2016-01-01 11:00:00 1
4 2016-01-01 10:40:00 0
5 2016-01-01 08:10:00 7
6 2016-01-01 11:35:00 2
7 2016-01-02 13:50:00 2
8 2016-01-02 14:50:00 4
9 2016-01-02 14:05:00 1
Please note that the rows are not sorted by the TIME column.
For each row, I want to know the average delay over the preceding 2 hours. To do this, I ran the following code:
df.index = pd.DatetimeIndex(df["TIME"])
df["DELAY_LAST2HOURS"] = df["DELAY"].rolling("2H").mean()
However, I got this error:
ValueError: index must be monotonic
How can I solve this task correctly?
Recommended answer
The problem is that the DatetimeIndex is not sorted, so you need DataFrame.sort_index:
df.index = pd.DatetimeIndex(df["TIME"])
df = df.sort_index()
df["DELAY_LAST2HOURS"] = df["DELAY"].rolling("2H").mean()
print(df)
TIME DELAY DELAY_LAST2HOURS
TIME
2016-01-01 06:30:00 2016-01-01 06:30:00 0 0.000000
2016-01-01 07:05:00 2016-01-01 07:05:00 2 1.000000
2016-01-01 08:10:00 2016-01-01 08:10:00 7 3.000000
2016-01-01 10:40:00 2016-01-01 10:40:00 0 0.000000
2016-01-01 11:00:00 2016-01-01 11:00:00 1 0.500000
2016-01-01 11:35:00 2016-01-01 11:35:00 2 1.000000
2016-01-01 14:10:00 2016-01-01 14:10:00 2 2.000000
2016-01-02 13:50:00 2016-01-02 13:50:00 2 2.000000
2016-01-02 14:05:00 2016-01-02 14:05:00 1 1.500000
2016-01-02 14:50:00 2016-01-02 14:50:00 4 2.333333
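To see the cause and the fix side by side, here is a minimal, self-contained sketch that rebuilds the sample data from the question. It assumes the 2-hour window can be passed as pd.Timedelta(hours=2) in place of the "2H" string (newer pandas releases warn about the uppercase "H" alias); the names window, raised, df_sorted and result are illustrative only.

```python
import pandas as pd

# Sample data copied from the question (rows deliberately out of time order).
df = pd.DataFrame({
    "TIME": [
        "2016-01-01 06:30:00", "2016-01-01 14:10:00", "2016-01-01 07:05:00",
        "2016-01-01 11:00:00", "2016-01-01 10:40:00", "2016-01-01 08:10:00",
        "2016-01-01 11:35:00", "2016-01-02 13:50:00", "2016-01-02 14:50:00",
        "2016-01-02 14:05:00",
    ],
    "DELAY": [0, 2, 2, 1, 0, 7, 2, 2, 4, 1],
})
df.index = pd.DatetimeIndex(df["TIME"])

# Assumption: a Timedelta is used as the window instead of the "2H" string.
window = pd.Timedelta(hours=2)

# The unsorted index is what triggers the error.
print(df.index.is_monotonic_increasing)  # False
try:
    df["DELAY"].rolling(window).mean()
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)

# After sorting, the index is monotonic and the rolling mean can be computed.
df_sorted = df.sort_index()
print(df_sorted.index.is_monotonic_increasing)  # True
result = df_sorted["DELAY"].rolling(window).mean()
print(result)
```

Checking index.is_monotonic_increasing before calling rolling is a cheap way to confirm whether a sort is needed.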
If the original TIME column does not need to be kept as a separate column, the whole thing can be written as:
df["TIME"] = pd.to_datetime(df["TIME"])
df = df.set_index('TIME').sort_index()
df["DELAY_LAST2HOURS"] = df["DELAY"].rolling("2H").mean()
print(df)
DELAY DELAY_LAST2HOURS
TIME
2016-01-01 06:30:00 0 0.000000
2016-01-01 07:05:00 2 1.000000
2016-01-01 08:10:00 7 3.000000
2016-01-01 10:40:00 0 0.000000
2016-01-01 11:00:00 1 0.500000
2016-01-01 11:35:00 2 1.000000
2016-01-01 14:10:00 2 2.000000
2016-01-02 13:50:00 2 2.000000
2016-01-02 14:05:00 1 1.500000
2016-01-02 14:50:00 4 2.333333
Or sort by the TIME column first and then set it as the index:
df["TIME"] = pd.to_datetime(df["TIME"])
df = df.sort_values('TIME').set_index('TIME')
df["DELAY_LAST2HOURS"] = df["DELAY"].rolling("2H").mean()
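All of the variants above leave the rows in time order. If the original row order from the question has to be preserved, one possible approach (a sketch, not part of the original answer) is to keep the integer index, roll over a sorted copy using the on= parameter of DataFrame.rolling, and let pandas align the result back by index label. The names window and rolled are illustrative, and the window is again given as a Timedelta rather than "2H" by assumption.

```python
import pandas as pd

# Sample data copied from the question, keeping the default integer index.
df = pd.DataFrame({
    "TIME": pd.to_datetime([
        "2016-01-01 06:30:00", "2016-01-01 14:10:00", "2016-01-01 07:05:00",
        "2016-01-01 11:00:00", "2016-01-01 10:40:00", "2016-01-01 08:10:00",
        "2016-01-01 11:35:00", "2016-01-02 13:50:00", "2016-01-02 14:50:00",
        "2016-01-02 14:05:00",
    ]),
    "DELAY": [0, 2, 2, 1, 0, 7, 2, 2, 4, 1],
})

# Assumption: a Timedelta is used as the window instead of the "2H" string.
window = pd.Timedelta(hours=2)

# Roll over a time-sorted copy; on="TIME" uses the column instead of the index,
# so the result keeps the original integer labels.
rolled = df.sort_values("TIME").rolling(window, on="TIME")["DELAY"].mean()

# Assignment aligns on index labels, so each value lands on its original row.
df["DELAY_LAST2HOURS"] = rolled
print(df)
```

Because the assignment aligns by label, no explicit re-sorting back to the original order is required.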