Subset pandas data frame with multiple conditions on several columns with date
Question
I have a data frame as follows:
   slot_id class        day   base_date
0        1     A     Monday  2019-01-21
1        2     B    Tuesday  2019-01-22
2        3     C  Wednesday  2019-01-23
3        4     C  Wednesday  2019-01-23
4        5     C   Thursday  2019-01-24
with the following info:
example.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8 entries, 0 to 7
Data columns (total 4 columns):
slot_id 8 non-null int64
class 8 non-null object
day 8 non-null object
base_date 8 non-null object
dtypes: int64(1), object(3)
memory usage: 200.0+ bytes
I would like to get the row with class == "C" that has the minimum base_date. I have tried many combinations of min without success, like this one:
example[(example['class']=="C") & min(example['base_date'])]
What's wrong with the code above? How could I get a full row meeting the condition above?
Full example data frame:
{'slot_id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8}, 'class': {0: 'A', 1: 'B', 2: 'C', 3: 'C', 4: 'C', 5: 'C', 6: 'D', 7: 'E'}, 'day': {0: 'Monday', 1: 'Tuesday', 2: 'Wednesday', 3: 'Wednesday', 4: 'Thursday', 5: 'Tuesday', 6: 'Thursday', 7: 'Saturday'}, 'base_date': {0: datetime.date(2019, 1, 21), 1: datetime.date(2019, 1, 22), 2: datetime.date(2019, 1, 23), 3: datetime.date(2019, 1, 23), 4: datetime.date(2019, 1, 24), 5: datetime.date(2019, 1, 22), 6: datetime.date(2019, 1, 24), 7: datetime.date(2019, 1, 26)}}
example['base_date'] = pd.to_datetime(example['base_date'].astype(str), format='%Y-%m-%d')
example['base_date'] = example['base_date'].dt.date
Recommended answer
You can use idxmin, which returns the index label of the smallest value:
pd.to_datetime(df.loc[df['class'] == 'C', 'base_date']).idxmin()
# 2
df.loc[pd.to_datetime(df.loc[df['class'] == 'C', 'base_date']).idxmin()]
slot_id 3
class C
day Wednesday
base_date 2019-01-23
Name: 2, dtype: object
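As for why the original attempt fails: min(example['base_date']) evaluates to a single date rather than a boolean Series, so it cannot be combined with the class mask using &. A minimal sketch of a mask-only alternative follows, assuming the five rows displayed in the question; the names dates, is_c and earliest are illustrative and not from the original post.

```python
import datetime

import pandas as pd

# The five rows displayed in the question, with base_date held as datetime.date objects.
example = pd.DataFrame({
    'slot_id': [1, 2, 3, 4, 5],
    'class': ['A', 'B', 'C', 'C', 'C'],
    'day': ['Monday', 'Tuesday', 'Wednesday', 'Wednesday', 'Thursday'],
    'base_date': [datetime.date(2019, 1, 21), datetime.date(2019, 1, 22),
                  datetime.date(2019, 1, 23), datetime.date(2019, 1, 23),
                  datetime.date(2019, 1, 24)],
})

# Work on a datetime64 copy of the column so comparisons are well defined.
dates = pd.to_datetime(example['base_date'])
is_c = example['class'] == 'C'

# Smallest date among the class C rows only.
earliest = dates[is_c].min()

# Both operands are boolean Series of the same length, so & works element-wise.
result = example[is_c & (dates == earliest)]
print(result)
```

Unlike idxmin, which returns only the first matching label, this mask keeps every row tied for the earliest date.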
If you need to do this repeatedly, a better solution is to pre-convert "base_date" to datetime type:
df['base_date'] = pd.to_datetime(df['base_date'], errors='coerce')
df.loc[df.loc[df['class'] == 'C', 'base_date'].idxmin()]
slot_id 3
class C
day Wednesday
base_date 2019-01-23 00:00:00
Name: 2, dtype: object
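Because idxmin returns an index label rather than an integer position, label-based .loc is the safer accessor once the frame no longer has a default RangeIndex. The sketch below assumes the full eight-row frame from the question, with dates written as ISO strings for brevity and the index shifted by an arbitrary offset of 100 purely for illustration.

```python
import pandas as pd

# Full eight-row frame from the question; dates are written as ISO strings
# here (an assumption for brevity) and converted to datetime64 below.
example = pd.DataFrame({
    'slot_id': [1, 2, 3, 4, 5, 6, 7, 8],
    'class': ['A', 'B', 'C', 'C', 'C', 'C', 'D', 'E'],
    'day': ['Monday', 'Tuesday', 'Wednesday', 'Wednesday',
            'Thursday', 'Tuesday', 'Thursday', 'Saturday'],
    'base_date': ['2019-01-21', '2019-01-22', '2019-01-23', '2019-01-23',
                  '2019-01-24', '2019-01-22', '2019-01-24', '2019-01-26'],
})
example['base_date'] = pd.to_datetime(example['base_date'], errors='coerce')

# Hypothetical non-default index: labels 100..107 instead of 0..7.
example.index = example.index + 100

# idxmin gives the label of the earliest class C row ...
label = example.loc[example['class'] == 'C', 'base_date'].idxmin()
# ... so it must be looked up by label; example.iloc[label] would be out of bounds here.
row = example.loc[label]
print(label)
print(row)
```

With all eight rows the earliest class C date is 2019-01-22 (slot_id 6), whereas the outputs shown in the answer appear to correspond to the five rows displayed at the top of the question.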