Converting different date time formats to MM/DD/YYYY format in pandas dataframe


Question

I have a date column in a pandas.DataFrame whose values come in various date-time formats and are stored as list objects, like the following:

            date
1    [May 23rd, 2011]
2    [January 1st, 2010]
    ...
99   [Apr. 15, 2008]
100  [07-11-2013]
    ...
256  [9/01/1995]
257  [04/15/2000]
258  [11/22/68]
    ...
360  [12/1997]
361  [08/2002]
     ...
463  [2014]
464  [2016]

For convenience, I want to convert them all to MM/DD/YYYY format. A regex replace() does not seem to work here, since the operation cannot be applied to list objects. Calling strptime() on each cell would also be too time-consuming.
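One way around the list-object obstacle, sketched here on a made-up three-row frame: `.str[0]` pulls the single string out of each list cell, after which the usual `.str.replace()` with a regex becomes available. The frame contents and the variable names `raw` and `no_ordinals` are illustrative assumptions, not part of the original data.

```python
import pandas as pd

# Illustrative frame: each cell is a one-element list, as in the question
df = pd.DataFrame(
    {"date": [["May 23rd, 2011"], ["January 1st, 2010"], ["07-11-2013"]]}
)

# .str[0] indexes into each list cell and returns its first element
raw = df["date"].str[0]

# With plain strings in hand, a regex replace can strip ordinal suffixes
no_ordinals = raw.str.replace(r"(\d+)(?:st|nd|rd|th)\b", r"\1", regex=True)

print(no_ordinals.tolist())
```

This only removes the list wrapper and the ordinal suffixes; the strings still have to be parsed into dates afterwards.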

What is an easier way to convert them all to the desired MM/DD/YYYY format? I found it very hard to do this on list objects within a dataframe.

Note: for cell values of the form [YYYY] (e.g., [2014] and [2016]), I will assume they are the first day of that year (e.g., January 1, 2014), and for cell values such as [08/2002] (or [8/2002]), I will assume they are the first day of that month (i.e., August 1, 2002).
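The two assumptions in the note can be made explicit with format strings. The following is a minimal sketch using the standard-library `datetime.strptime`; the helper name `partial_to_mmddyyyy` is hypothetical, and it covers only the year-only and month/year shapes, returning `None` for anything else.

```python
from datetime import datetime

def partial_to_mmddyyyy(text):
    """Hypothetical helper: format 'YYYY' or 'MM/YYYY' as MM/DD/YYYY."""
    # strptime fills missing fields with 1, so a bare year becomes
    # January 1 and a month/year becomes the first day of that month.
    for fmt in ("%Y", "%m/%Y"):
        try:
            return datetime.strptime(text, fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue
    # Any other shape is left for a more general parser
    return None

print(partial_to_mmddyyyy("2014"))
print(partial_to_mmddyyyy("08/2002"))
print(partial_to_mmddyyyy("8/2002"))
```

Full dates such as `04/15/2000` match neither format, so they fall through to `None` and can be handed to a general parser instead.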

Recommended answer

Given your sample data, with the addition of an empty value that parses to NaT, this works:

df.date.apply(lambda x: pd.to_datetime(x).strftime('%m/%d/%Y')[0])

Test code:

import pandas as pd

df = pd.DataFrame([
    [['']],  # empty string; pd.to_datetime parses it as NaT
    [['May 23rd, 2011']],
    [['January 1st, 2010']],
    [['Apr. 15, 2008']],
    [['07-11-2013']],
    [['9/01/1995']],
    [['04/15/2000']],
    [['11/22/68']],
    [['12/1997']],
    [['08/2002']],
    [['2014']],
    [['2016']],
], columns=['date'])

df['clean_date'] = df.date.apply(
    lambda x: pd.to_datetime(x).strftime('%m/%d/%Y')[0])

print(df)

Results:

                   date  clean_date
0                    []         NaT
1      [May 23rd, 2011]  05/23/2011
2   [January 1st, 2010]  01/01/2010
3       [Apr. 15, 2008]  04/15/2008
4          [07-11-2013]  07/11/2013
5           [9/01/1995]  09/01/1995
6          [04/15/2000]  04/15/2000
7            [11/22/68]  11/22/1968
8             [12/1997]  12/01/1997
9             [08/2002]  08/01/2002
10               [2014]  01/01/2014
11               [2016]  01/01/2016
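
If the column may also contain values that cannot be parsed at all, a named helper is a little easier to extend than the lambda. This is a sketch rather than the answer's own code: `clean_date` is a hypothetical name, and it assumes every cell is a one-element list. It parses each cell on its own, so the formats never need to agree across rows, and it returns `None` for anything `pd.to_datetime` cannot read.

```python
import pandas as pd

def clean_date(cell):
    """Format the single string in `cell` as MM/DD/YYYY, or return None."""
    # errors="coerce" turns unparseable input into NaT instead of raising
    ts = pd.to_datetime(cell[0], errors="coerce")
    return None if pd.isna(ts) else ts.strftime("%m/%d/%Y")

# Illustrative rows: a valid date, an empty string, and junk text
df = pd.DataFrame({"date": [["04/15/2000"], [""], ["not a date"]]})
df["clean_date"] = df["date"].apply(clean_date)

print(df)
```

Ambiguous inputs still deserve a spot check: `07-11-2013` is read month-first unless `dayfirst=True` is passed, and the century chosen for a two-digit year such as `11/22/68` depends on the parser's pivot rule.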
