Normalize different date data into a single format in Python
Question
I am currently analyzing a dataset that contains many different date formats, such as:
12/31/1991
December 10, 1980
September 25, 1970
2005-11-14
December 1990
October 12, 2005
1993-06-26
Is there a way to normalize all of these dates into the single format 'YYYY-MM-DD'? I am familiar with the datetime package in Python, but what is the best way to approach this problem so that it handles all the different date formats?
Recommended answer
If you are okay with using a library, you can use dateutil (a third-party package, not part of the standard library; install it with pip install python-dateutil). Use the dateutil.parser.parse function to parse each date into a datetime object, then call datetime.datetime.strftime() to format it back into a string of the form 'YYYY-MM-DD'. Example:
>>> s = """12/31/1991
... December 10, 1980
... September 25, 1970
... 2005-11-14
... December 1990
... October 12, 2005
... 1993-06-26"""
>>> from dateutil import parser
>>> for i in s.splitlines():
...     d = parser.parse(i)
...     print(d.strftime("%Y-%m-%d"))
...
1991-12-31
1980-12-10
1970-09-25
2005-11-14
1990-12-10
2005-10-12
1993-06-26
One thing to note: when parts of the date are missing from the string, dateutil.parser.parse fills them in from the current date and time. This can be seen above in the parsing of 'December 1990', which came out as 1990-12-10 because the example was run on the 10th of the month.
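If a fixed fill-in value is preferable to the current date, parser.parse accepts a default argument, a datetime whose fields are used for anything missing from the input. The sketch below is one way to use it; the names FALLBACK and normalize, and the choice of January 1, 1900 as the fallback, are illustrative assumptions rather than part of the original answer.

```python
from datetime import datetime

from dateutil import parser  # third-party: pip install python-dateutil

# Assumption for illustration: fields missing from the input are copied
# from this datetime instead of from the current date and time.
FALLBACK = datetime(1900, 1, 1)


def normalize(text):
    """Parse a date string and return it formatted as 'YYYY-MM-DD'."""
    parsed = parser.parse(text, default=FALLBACK)
    return parsed.strftime("%Y-%m-%d")


print(normalize("December 1990"))     # 1990-12-01
print(normalize("October 12, 2005"))  # 2005-10-12
```

With this fallback, 'December 1990' always normalizes to the first day of the month, so the result no longer depends on the day the script is run.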