In Python-Pandas, how can I subset a dataframe by specific datetime index values?
Question
I have a dataframe spanning many days that looks like this, with consecutive rows at 30-minute intervals:
a b
2006-05-08 09:30:00 10 13
2006-05-08 10:00:00 11 12
.
.
.
2006-05-08 15:30:00 15 14
2006-05-08 16:00:00 16 15
However, I only care about certain specific times, so I want every day of the df to look like:
2006-05-08 09:30:00 10 13
2006-05-08 11:30:00 14 15
2006-05-08 13:00:00 18 15
2006-05-08 16:00:00 16 15
Meaning, I just want to keep the rows at times (16:00, 13:00, 11:30, 9:30) for all the different days in the dataframe.
Thanks.
Update:
I made a bit of progress, using
hour = df.index.hour
selector = ((hour == 16) | (hour == 13) | (hour == 11) | (hour == 9))
df = df[selector]
However, I need to account for the minutes too, so I tried:
minute = df.index.minute
selector = ((hour == 16) & (minute == 0) | (hour == 3) & (minute == 0) | (hour == 9) & (minute == 30) | (hour == 12) & (minute == 0))
But I got the error:
ValueError: operands could not be broadcast together with shapes (96310,) (16500,)
Recommended answer
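import numpy as np
import pandas as pd

N = 100
# Sample frame: N consecutive rows at 30-minute intervals
df = pd.DataFrame(range(N), index=pd.date_range('2000-1-1', freq='30min', periods=N))
# Encode each timestamp as hour*100 + minute (09:30 -> 930) and keep only the wanted times
mask = np.isin(df.index.hour * 100 + df.index.minute, [930, 1130, 1300, 1600])
print(df.loc[mask])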
yields
0
2000-01-01 09:30:00 19
2000-01-01 11:30:00 23
2000-01-01 13:00:00 26
2000-01-01 16:00:00 32
2000-01-02 09:30:00 67
2000-01-02 11:30:00 71
2000-01-02 13:00:00 74
2000-01-02 16:00:00 80
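As for the ValueError in the question: the two shapes (96310,) and (16500,) suggest that `hour` was computed before `df = df[selector]` shortened the frame and `minute` was computed after it, so the two arrays no longer line up. A minimal sketch of the boolean-mask approach with both arrays taken from the same, unfiltered index follows. The sample frame and the names `subset`, `wanted` and `subset_by_time` are made up for illustration, and the second variant (comparing time-of-day objects) is just one alternative, not the original answer's method.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Hypothetical sample data: three full days of 30-minute rows.
idx = pd.date_range("2006-05-08", periods=3 * 48, freq="30min")
df = pd.DataFrame({"a": range(len(idx))}, index=idx)

# Take hour and minute from the SAME unfiltered index so both
# arrays have one entry per row of df.
hour = df.index.hour
minute = df.index.minute
selector = (
    ((hour == 9) & (minute == 30))
    | ((hour == 11) & (minute == 30))
    | ((hour == 13) & (minute == 0))
    | ((hour == 16) & (minute == 0))
)
subset = df[selector]

# Alternative: compare each row's time of day against a set of wanted times.
wanted = {dt.time(9, 30), dt.time(11, 30), dt.time(13, 0), dt.time(16, 0)}
time_mask = np.array([t in wanted for t in df.index.time])
subset_by_time = df[time_mask]

print(subset.head(8))
```

Parenthesizing each `&` pair explicitly keeps the intent clear, even though `&` already binds more tightly than `|` in Python.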