Drop dates based on a condition in Python
Problem description
I'm trying to implement a condition where, if the count of incorrect values on a date is greater than 2 (2019-05-17 and 2019-05-20 in the example below), every row for that date (all the time blocks) is removed.
Input
t_value C/IC
2019-05-17 00:00:00 0 incorrect
2019-05-17 01:00:00 0 incorrect
2019-05-17 02:00:00 0 incorrect
2019-05-17 03:00:00 4 correct
2019-05-17 04:00:00 5 correct
2019-05-18 01:00:00 0 incorrect
2019-05-18 02:00:00 6 correct
2019-05-18 03:00:00 7 correct
2019-05-19 04:00:00 0 incorrect
2019-05-19 09:00:00 0 incorrect
2019-05-19 11:00:00 8 correct
2019-05-20 07:00:00 2 correct
2019-05-20 08:00:00 0 incorrect
2019-05-20 09:00:00 0 incorrect
2019-05-20 07:00:00 0 incorrect
Desired output
t_value C/IC
2019-05-18 01:00:00 0 incorrect
2019-05-18 02:00:00 6 correct
2019-05-18 03:00:00 7 correct
2019-05-19 04:00:00 0 incorrect
2019-05-19 09:00:00 0 incorrect
2019-05-19 11:00:00 8 correct
I'm not sure which time-based operation to perform to get the desired result. Thanks.
Recommended answer
import numpy as np
import pandas as pd
from io import StringIO

#read in data
#`data` is assumed to hold the input table above as a multi-line string,
#with columns separated by two or more spaces
df = pd.read_csv(StringIO(data), sep=r'\s{2,}', engine='python')
#give index a name
df.index.name = 'Date'
#convert to datetime
#and sort index
#usually safer to sort datetime index in Pandas
df.index = pd.to_datetime(df.index)
df = df.sort_index()
res = (df
       #group by date and c/ic
       .groupby([pd.Grouper(freq='1D', level='Date'), "C/IC"])
       .size()
       #get rows greater than 2 and incorrect
       .loc[lambda x: x>2, "incorrect"]
       #keep only the date index
       .droplevel(-1)
       .index
       #datetime information trapped here
       #and due to grouping, it is different from initial datetime
       #as such, we convert to string
       #and build another batch of dates
       .astype(str)
       .tolist()
       )
res
['2019-05-17', '2019-05-20']
#build a numpy array of dates
idx = np.array(res, dtype='datetime64')
#exclude dates in idx and get final value
#aim is to get dates, irrespective of time
df.loc[~np.isin(df.index.date, idx)]
t_value C/IC
Date
2019-05-18 01:00:00 0 incorrect
2019-05-18 02:00:00 6 correct
2019-05-18 03:00:00 7 correct
2019-05-19 04:00:00 0 incorrect
2019-05-19 09:00:00 0 incorrect
2019-05-19 11:00:00 8 correct
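The same filter can also be sketched without the round trip through strings. This is a different technique from the answer above, not the answer's own method: it counts the "incorrect" rows per calendar day with `groupby(...).transform("sum")` and keeps the days whose count is at most 2. The `rows` list is typed in by hand from the question's input, and the names `is_bad`, `bad_per_day`, and `result` are illustrative.

```python
import pandas as pd

# The question's input, typed in by hand (timestamp, t_value, C/IC).
rows = [
    ("2019-05-17 00:00:00", 0, "incorrect"),
    ("2019-05-17 01:00:00", 0, "incorrect"),
    ("2019-05-17 02:00:00", 0, "incorrect"),
    ("2019-05-17 03:00:00", 4, "correct"),
    ("2019-05-17 04:00:00", 5, "correct"),
    ("2019-05-18 01:00:00", 0, "incorrect"),
    ("2019-05-18 02:00:00", 6, "correct"),
    ("2019-05-18 03:00:00", 7, "correct"),
    ("2019-05-19 04:00:00", 0, "incorrect"),
    ("2019-05-19 09:00:00", 0, "incorrect"),
    ("2019-05-19 11:00:00", 8, "correct"),
    ("2019-05-20 07:00:00", 2, "correct"),
    ("2019-05-20 08:00:00", 0, "incorrect"),
    ("2019-05-20 09:00:00", 0, "incorrect"),
    ("2019-05-20 07:00:00", 0, "incorrect"),
]
df = pd.DataFrame(rows, columns=["Date", "t_value", "C/IC"])
df["Date"] = pd.to_datetime(df["Date"])
df = df.set_index("Date")

# True where the row is flagged "incorrect".
is_bad = df["C/IC"].eq("incorrect")

# For every row, the number of "incorrect" rows on the same calendar day.
# normalize() sets the time part to midnight, so rows group by date.
bad_per_day = is_bad.groupby(df.index.normalize()).transform("sum")

# Keep only rows whose day has at most 2 incorrect values.
# to_numpy() makes the mask positional, which sidesteps label alignment
# on the duplicated 2019-05-20 07:00:00 timestamp.
result = df[(bad_per_day <= 2).to_numpy()]
print(result)
```

On this input, `result` holds the six rows for 2019-05-18 and 2019-05-19, matching the desired output. Because `transform` returns one value per original row, the mask lines up with `df` directly and no separate list of dates has to be built.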