Exclude a specific date based on a condition using pandas

Problem description

import pandas as pd

df2 = pd.DataFrame({'person_id':[11,11,11,11,11,12,12,13,13,14,14,14,14],
                    'admit_date':['01/01/2011','01/01/2009','12/31/2013','12/31/2017','04/03/2014','08/04/2016',
                                  '03/05/2014','02/07/2011','08/08/2016','12/31/2017','05/01/2011','05/21/2014','07/12/2016']})
df2 = df2.melt('person_id', value_name='dates')
df2['dates'] = pd.to_datetime(df2['dates'])

What I want to do is:

a) Exclude/filter out records from the data frame if a subject has both Dec 31st and Jan 1st in their records. Note that the year doesn't matter.

If a subject has only one of Dec 31st or Jan 1st, we leave their records as is.

But if they have both Dec 31st and Jan 1st, we remove one of them (either Dec 31st or Jan 1st). Note that a subject can also have multiple entries with the same date, as person_id = 11 does.

I was only able to do the following:

df2_new =  df2['dates'] != '2017-12-31'  #but this excludes if a subject has only `Dec 31st on 2017`. How can I ignore the dates and not consider `year`
df2[df2_new]  

My expected output is as follows:

For person_id = 11, we drop 12-31 because that person has both 12-31 and 01-01 in their records, whereas for person_id = 14, we don't drop 12-31 because that person has only 12-31 in their records.

We drop 12-31 only when both 12-31 and 01-01 appear in a person's records.

Recommended answer

Use:

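import numpy as np

# Month-day string for each row, so the year is ignored
s = df2['dates'].dt.strftime('%m-%d')
# True on every row of a person who has at least one Jan 1 (m1) / Dec 31 (m2) record
m1 = s.eq('01-01').groupby(df2['person_id']).transform('any')
m2 = s.eq('12-31').groupby(df2['person_id']).transform('any')
# If a person has both dates, keep only their non-Dec-31 rows; otherwise keep every row
m3 = np.select([m1 & m2, m1 | m2], [s.ne('12-31'), True], default=True)
df3 = df2[m3]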

Result:

# print(df3)
    person_id    variable      dates
0          11  admit_date 2011-01-01
1          11  admit_date 2009-01-01
4          11  admit_date 2014-04-03
5          12  admit_date 2016-08-04
6          12  admit_date 2014-03-05
7          13  admit_date 2011-02-07
8          13  admit_date 2016-08-08
9          14  admit_date 2017-12-31
10         14  admit_date 2011-05-01
11         14  admit_date 2014-05-21
12         14  admit_date 2016-07-12
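
The rule in the question (drop a Dec 31 row only when the same person also has a Jan 1 row) can also be written as a single boolean mask without `np.select`. The following is a minimal sketch, not part of the original answer: it rebuilds the sample data directly with a `dates` column (skipping the `melt` step), and the names `is_jan1`, `is_dec31`, `has_jan1`, `drop`, and `filtered` are chosen here purely for illustration.

```python
import pandas as pd

# Sample data from the question, built without the melt step (an assumption
# made here to keep the example short).
df = pd.DataFrame({
    'person_id': [11, 11, 11, 11, 11, 12, 12, 13, 13, 14, 14, 14, 14],
    'dates': pd.to_datetime(
        ['01/01/2011', '01/01/2009', '12/31/2013', '12/31/2017', '04/03/2014',
         '08/04/2016', '03/05/2014', '02/07/2011', '08/08/2016', '12/31/2017',
         '05/01/2011', '05/21/2014', '07/12/2016'],
        format='%m/%d/%Y'),
})

# Row-level flags based on month and day only, so the year is ignored.
is_jan1 = (df['dates'].dt.month == 1) & (df['dates'].dt.day == 1)
is_dec31 = (df['dates'].dt.month == 12) & (df['dates'].dt.day == 31)

# Broadcast "this person has at least one Jan 1 row" to all of that person's rows.
has_jan1 = is_jan1.groupby(df['person_id']).transform('any')

# A row is dropped only if it is a Dec 31 row of a person who also has Jan 1.
drop = is_dec31 & has_jan1
filtered = df[~drop]

print(filtered)
```

With this sample data, the rows kept by `filtered` have the same index labels as `df3` above. A separate "has Dec 31" mask is not needed in this formulation, because any row flagged by `is_dec31` already implies that its person has a Dec 31 record.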
