将日期与 Pandas 中的多列进行比较 [英] Comparing date with multiple columns in Pandas
本文介绍了将日期与 Pandas 中的多列进行比较的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!
问题描述
我有一个包含 6 列的数据框,我想使用其中包含日期的 5 列(即第一次旅行、第二次旅行、第三次旅行).从这 5 列中,我想考虑最大日期并将其与给定日期2020-09-25 00:00:00"
I have a dataframe with 6 columns and I wanted to use 5 columns of it (i.e FIRST TRAVEL, SECOND TRAVEL, THIRD TRAVEL) which contains date in it. From these 5 columns, I wanted to consider the greatest date and compare it against the given date "2020-09-25 00:00:00"
必须满足以下条件:
- 如果日期大于输入日期,则很好.如果不是,我们将不得不写为Offer Expired";在名为 RESULT 的新列中.
,
Customer Name FIRST TRAVEL SECOND TRAVEL THIRD TRAVEL FOURTH TRAVEL FIFTH TRAVEL RESULT
0 USER1 NaT 2020-09-02 08:21:59 NaT NaT NaT Offer Expired
1 USER2 NaT 2014-11-05 15:23:38 NaT NaT NaT Offer Expired
2 USER3 NaT NaT NaT NaT NaT NaN
3 USER4 NaT NaT NaT NaT NaT NaN
4 USER5 NaT NaT NaT NaT NaT NaN
5 USER6 NaT NaT NaT NaT NaT NaN
6 USER7 NaT NaT NaT NaT NaT NaN
7 USER8 NaT NaT NaT NaT NaT NaN
8 USER9 NaT 2020-09-02 10:07:11 NaT NaT NaT Offer Expired
9 USER10 2020-03-16 00:00:00 NaT NaT NaT NaT Offer Expired
10 USER11 2019-12-11 00:00:00 NaT NaT NaT NaT Offer Expired
11 USER12 2020-09-26 00:00:00 2020-04-14 00:00:00 NaT NaT NaT NaN
12 USER13 2020-04-20 00:00:00 2019-10-18 00:00:00 NaT NaT NaT Offer Expired
13 USER14 2020-02-21 00:00:00 2020-04-20 00:00:00 NaT NaT NaT Offer Expired
14 USER15 2020-01-17 00:00:00 2019-10-17 00:00:00 NaT NaT NaT Offer Expired
15 USER16 NaT 2020-04-20 00:00:00 NaT NaT NaT Offer Expired
16 USER17 NaT 2019-08-24 00:00:00 NaT NaT NaT Offer Expired
17 USER18 NaT 2019-11-01 00:00:00 NaT NaT NaT Offer Expired
18 USER19 NaT 2019-09-13 00:00:00 NaT NaT NaT Offer Expired
19 USER20 NaT 2020-01-13 00:00:00 NaT NaT NaT Offer Expired
20 USER21 NaT 2019-09-13 00:00:00 NaT NaT NaT Offer Expired
21 USER22 NaT 2020-04-20 00:00:00 NaT NaT NaT Offer Expired
22 USER23 NaT 2020-02-12 00:00:00 NaT NaT NaT Offer Expired
23 USER24 NaT 2019-10-18 00:00:00 NaT NaT NaT Offer Expired
24 USER25 2020-09-06 22:09:22 2020-04-07 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
25 USER26 2020-09-06 22:09:22 2020-04-21 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
26 USER27 NaT NaT NaT 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
27 USER28 NaT NaT NaT 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
28 USER29 2020-09-06 22:09:22 2020-04-17 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
29 USER30 2020-09-06 22:09:22 NaT NaT NaT 2020-06-03 19:45:36 Offer Expired
30 USER31 2020-09-06 22:09:22 NaT NaT NaT 2020-06-03 19:45:36 Offer Expired
31 USER32 2020-09-06 22:09:22 NaT NaT NaT 2020-06-03 19:45:36 Offer Expired
32 USER33 2020-09-06 22:09:22 NaT NaT NaT 2020-06-03 19:45:36 Offer Expired
33 USER34 2020-09-06 22:09:22 2020-10-27 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 NaN
34 USER35 2020-09-06 22:09:22 2019-06-18 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
35 USER36 2020-09-06 22:09:22 2020-04-15 00:00:00 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
36 USER37 2020-09-06 22:09:22 2020-09-04 15:29:45 2020-08-28 10:17:50 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
37 USER38 2020-09-06 22:09:22 NaT NaT 2020-09-25 17:03:20 2020-06-03 19:45:36 NaN
38 USER39 NaT NaT NaT 2020-09-04 17:03:20 2020-06-03 19:45:36 Offer Expired
注意:这在 Excel 中更简单,我们可以使用以下公式.但是,我找不到做这个 Pandas 的方法.
Note: This has been simpler in Excel where we can use the below formula. However, I couldn't find a way to do this Pandas.
=IF(COUNTBLANK($B2:$F2)=5,"", IF(MAX($B2:$F2)>$H$1,"","Offer Expired");))
感谢任何帮助.
推荐答案
这可以解决问题(内嵌评论)
This will do the trick (comments inline)
import numpy as np
import pandas as pd
# I'm assuming all the relevant columns are already converted:
dt = pd.to_datetime("2020-09-25 00:00:00")
# you need to indicate somehow the columns to compare - using regex on column names:
dftravels = df.filter(regex=".* TRAVEL$", axis=1)
# NaT and any logical 2 argument operation on it always will evaluate to False
# hence you check only ones where there's not a single date after dt and where there's at least one date overall
df["Result"] = np.where(~dftravels.gt(dt).any(axis=1) & dftravels.any(axis=1), "Offer expired", "")
这篇关于将日期与 Pandas 中的多列进行比较的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!
查看全文