How to parse different string date formats?

Problem description

I'm working on a table that contains a mix of different strings, each of which can be used to derive a date.

    period
0   Q2 '20 Base
1   Q3 '20 Base
2   Q1 '21 Base
3   February '20 Base
4   March '20 Peak
5   Summer 22 Base
6   Winter 20 Peak
7   Summer 21 Base
8   Year 2021
9   October '21 Peak

I'd like to parse this into timestamps for analysis in Python. Ideally, I would first parse it into 4 new columns: 1) day, 2) month, 3) quarter, 4) year. I would then use these columns to build a datetime (DD-MM-YYYY).

    period             day  month quarter year
0   Q2 '20 Base         01  04    2       2020
1   Q3 '20 Base         01  07    3       2020
2   Q1 '21 Base         01  01    1       2021
3   February '20 Base   01  02    1       2020
4   March '20 Peak      01  03    1       2020
5   Summer 22 Base      01  04    2       2022
6   Winter 20 Peak      01  10    4       2020
7   Summer 21 Base      01  04    2       2021
8   Year 2021           01  01    1       2021
9   October '21 Peak    01  10    4       2021

How can I parse this into the 4 new columns?

Recommended answer

My idea is to set up a dictionary data structure for your identifiers like this:

# Maps the first word of a period to the start of that period.
datemap = {'January': {'day': 1, 'month': 1, 'quarter': 1},
           'February': {'day': 1, 'month': 2, 'quarter': 1},
           'March': {'day': 1, 'month': 3, 'quarter': 1},
           # and so on ...
           'Spring': {'day': 1, 'month': 1, 'quarter': 1},
           'Summer': {'day': 1, 'month': 4, 'quarter': 2},
           'Fall': {'day': 1, 'month': 7, 'quarter': 3},
           'Winter': {'day': 1, 'month': 10, 'quarter': 4},
           'Q1': {'day': 1, 'month': 1, 'quarter': 1},
           'Q2': {'day': 1, 'month': 4, 'quarter': 2},
           'Q3': {'day': 1, 'month': 7, 'quarter': 3},
           'Q4': {'day': 1, 'month': 10, 'quarter': 4},
           'Year': {'day': 1, 'month': 1, 'quarter': 1}}
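
The month entries elided by "and so on" follow a regular pattern, so one option is to generate them instead of typing them out. This is a minimal sketch, assuming the English month names that Python's `calendar` module returns under the default locale; `month_entries` is a hypothetical name that does not appear in the original answer.

```python
import calendar

# Hypothetical helper dictionary: one entry per month, anchored to day 1.
# The quarter is derived from the month number (1-3 -> 1, 4-6 -> 2, ...).
month_entries = {
    calendar.month_name[m]: {'day': 1, 'month': m, 'quarter': (m - 1) // 3 + 1}
    for m in range(1, 13)
}

print(month_entries['October'])
```

The result could be merged into the dictionary with `datemap.update(month_entries)`, which leaves the season, quarter, and 'Year' keys untouched.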

Then you can transform a given value r['period'] by looking up its first word, r['period'].split()[0], in the dictionary (the year comes from the second word), like this:

# The first word of each period selects the datemap entry.
df['day'] = df.apply(lambda r: datemap[r['period'].split()[0]]['day'], axis=1)
df['month'] = df.apply(lambda r: datemap[r['period'].split()[0]]['month'], axis=1)
df['quarter'] = df.apply(lambda r: datemap[r['period'].split()[0]]['quarter'], axis=1)
# The second word carries the year; its last two digits are prefixed with "20".
df['year'] = df.apply(lambda r: int("20" + r['period'].split()[1][-2:]), axis=1)
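
The question also asks for a datetime built from the new columns. The sketch below shows one possible way to do that with `pd.to_datetime`, which accepts a DataFrame with `year`, `month`, and `day` columns. The function name `add_date_columns`, the reduced `sample_map`, and the four-row `sample` frame are assumptions made for illustration. Like the answer above, it assumes every year falls in the 2000s.

```python
import pandas as pd

def add_date_columns(df, datemap):
    """Return a copy of df with day, month, quarter, year, and date columns."""
    out = df.copy()
    words = out['period'].str.split()
    first = words.str[0]  # month name, season, quarter label, or 'Year'
    for col in ('day', 'month', 'quarter'):
        out[col] = first.map(lambda k: datemap[k][col])
    # Keep the last two digits of the year token: "'20" -> 2020, "2021" -> 2021.
    out['year'] = ("20" + words.str[1].str[-2:]).astype(int)
    # Assemble a datetime from the year, month, and day columns.
    out['date'] = pd.to_datetime(out[['year', 'month', 'day']])
    return out

# Reduced lookup table covering only the keys used in the sample below.
sample_map = {
    'Q2': {'day': 1, 'month': 4, 'quarter': 2},
    'March': {'day': 1, 'month': 3, 'quarter': 1},
    'Winter': {'day': 1, 'month': 10, 'quarter': 4},
    'Year': {'day': 1, 'month': 1, 'quarter': 1},
}
sample = pd.DataFrame(
    {'period': ["Q2 '20 Base", "March '20 Peak", "Winter 20 Peak", "Year 2021"]}
)
result = add_date_columns(sample, sample_map)
print(result['date'].dt.strftime('%d-%m-%Y').tolist())
# ['01-04-2020', '01-03-2020', '01-10-2020', '01-01-2021']
```

The `date` column holds real timestamps, so the DD-MM-YYYY layout is only applied when formatting for display. A period whose first word is missing from the lookup table raises a `KeyError`, which makes unhandled formats easy to spot.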
