Filter a DataFrame in pandas by a date column
Question
The data is at the following link: http://www.fdic.gov/bank/individual/failed/banklist.html
I want only the banks that closed in 2017. How can I do this in pandas?
import pandas as pd

failed_banks = pd.read_html('http://www.fdic.gov/bank/individual/failed/banklist.html')
failed_banks[0]
What should I do after these lines of code to extract the desired result?
Recommended answer
Ideally, you would use:
# assuming pandas successfully parsed this column as a datetime column
# and pandas version >= 0.16
failed_banks = pd.read_html('http://www.fdic.gov/bank/individual/failed/banklist.html')[0]
failed_banks = failed_banks[failed_banks['Closing Date'].dt.year == 2017]
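Whether that assumption holds can be checked before reaching for `.dt`. The following is a minimal sketch, not part of the original answer: the two-row `sample` frame is a hypothetical stand-in for the FDIC table, and the date strings are assumed to look like "December 15, 2017".

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype

# Hypothetical stand-in for the table returned by pd.read_html(...)[0].
sample = pd.DataFrame(
    {"Closing Date": ["December 15, 2017", "September 23, 2016"]}
)

# A column of plain strings is not a datetime dtype, so the .dt accessor
# would raise an AttributeError on it.
is_parsed = is_datetime64_any_dtype(sample["Closing Date"])
print(is_parsed)
```

If the check reports that the column is not a datetime dtype, the column has to be converted or parsed by hand before filtering on the year.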
But pandas does not correctly parse the Closing Date column
as date objects, so we need to parse it ourselves:
failed_banks = pd.read_html('http://www.fdic.gov/bank/individual/failed/banklist.html')[0]

def closed_in_2017(date_str):
    # The year is the part after the last ', ' in the date string
    return int(date_str.split(', ')[-1]) == 2017

failed_banks = failed_banks[failed_banks['Closing Date'].apply(closed_in_2017)]
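An alternative to splitting the string by hand is to convert the column with `pd.to_datetime` and then filter on the year. This is only a sketch under stated assumptions: the rows below are hypothetical sample data rather than the real FDIC table, and the format string `"%B %d, %Y"` assumes dates written like "December 15, 2017".

```python
import pandas as pd

# Hypothetical sample rows standing in for the FDIC table; the real data
# would come from pd.read_html(...)[0] as in the answer above.
failed_banks = pd.DataFrame(
    {
        "Bank Name": ["Bank A", "Bank B", "Bank C"],
        "Closing Date": ["December 15, 2017", "May 26, 2017", "September 23, 2016"],
    }
)

# Convert the text column to datetime values. With errors="coerce",
# strings that do not match the assumed format become NaT instead of raising.
closing = pd.to_datetime(
    failed_banks["Closing Date"], format="%B %d, %Y", errors="coerce"
)

# Keep only the rows whose closing year is 2017 (NaT rows are dropped).
closed_2017 = failed_banks[closing.dt.year == 2017]
print(closed_2017)
```

Keeping the converted values in a separate `closing` series leaves the original column untouched; assigning it back to `failed_banks["Closing Date"]` would also work if later steps need real dates.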