Filter a DataFrame in pandas by a date column


Problem description

The data is at the following link: http://www.fdic.gov/bank/individual/failed/banklist.html

I want only the banks that closed in 2017. How can I do this in pandas?

import pandas as pd

failed_banks = pd.read_html('http://www.fdic.gov/bank/individual/failed/banklist.html')
failed_banks[0]

What should I do after these lines of code to extract the desired result?

Recommended answer

Ideally, you would use:

# assuming pandas successfully parsed this column as datetime object
# and pandas version >= 0.16
failed_banks = pd.read_html('http://www.fdic.gov/bank/individual/failed/banklist.html')[0]
failed_banks = failed_banks[failed_banks['Closing Date'].dt.year == 2017]
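Whether that assumption holds can be checked by inspecting the column's dtype before reaching for `.dt`. The following is only a minimal sketch: `sample` is a small hand-made frame with hypothetical rows standing in for the FDIC table, not the real data.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical stand-in for the FDIC table; the rows are illustrative only.
sample = pd.DataFrame({
    'Bank Name': ['Bank A', 'Bank B'],
    'Closing Date': ['October 13, 2017', 'May 6, 2016'],
})

# Dates scraped from HTML typically arrive as plain strings, in which case
# the .dt accessor is not available on the column.
is_datetime = is_datetime64_any_dtype(sample['Closing Date'])
print(sample['Closing Date'].dtype, is_datetime)
```

If the check reports that the column is not a datetime type, the `.dt.year` filter raises an error and the dates have to be parsed first.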

However, pandas does not parse the Closing Date column as datetime objects, so we need to parse it ourselves:

failed_banks = pd.read_html('http://www.fdic.gov/bank/individual/failed/banklist.html')[0]

def parse_date_strings(date_str):
    # Assumes dates like 'October 13, 2017': the year follows the last ', '
    return int(date_str.split(', ')[-1]) == 2017

failed_banks = failed_banks[failed_banks['Closing Date'].apply(parse_date_strings)]
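An alternative to splitting the strings by hand is to convert the column with `pd.to_datetime` and then filter on `.dt.year`, as in the ideal case. This is a sketch rather than part of the original answer: the `sample` frame and its rows are hypothetical, and the `'%B %d, %Y'` format is an assumption about how the page writes its dates.

```python
import pandas as pd

# Hypothetical stand-in for the FDIC table; the rows are illustrative only.
sample = pd.DataFrame({
    'Bank Name': ['Bank A', 'Bank B', 'Bank C'],
    'Closing Date': ['October 13, 2017', 'May 6, 2016', 'January 27, 2017'],
})

# Convert the strings to datetime64. The format string is an assumption
# about the page's date style (e.g. 'October 13, 2017').
sample['Closing Date'] = pd.to_datetime(sample['Closing Date'], format='%B %d, %Y')

# With a real datetime column, the .dt accessor can be used for filtering.
closed_2017 = sample[sample['Closing Date'].dt.year == 2017]
print(closed_2017)
```

On the real table, the same conversion and filter would be applied to `failed_banks`. If the dates turn out to use a different style, omitting the `format` argument lets pandas try to infer it.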
