Convert string date to a different format in pandas dataframe


Question

I have been looking for this answer in the community, but have not found it so far.

I have a dataframe in Python 3.5.1 that contains a column of dates stored as strings, imported from a CSV file.

The dataframe looks like this:

                  TimeStamp  TBD  TBD     Value  TBD
0       2016/06/08 17:19:53  NaN  NaN  0.062942  NaN
1       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN
2       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN

What I need is to change the TimeStamp column format to %m/%d/%Y %H:%M:%S:

                  TimeStamp  TBD  TBD     Value  TBD
0       06/08/2016 17:19:53  NaN  NaN  0.062942  NaN

So far I have found some solutions that work, but only for a single string and not for a Series.

Any help would be greatly appreciated.

Thanks

Recommended answer

If you convert the column of strings to a time series, you could use the dt.strftime method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')
print(df)

yields

   TBD  TBD.1  TBD.2            TimeStamp     Value
0  NaN    NaN    NaN  06/08/2016 17:19:53  0.062942
1  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
2  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
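
As a minimal sketch of a variation on the approach above: if the input layout is known, it can be passed explicitly to pd.to_datetime, and errors='coerce' turns unparseable entries into NaT instead of raising. The sample series and the names raw, parsed and formatted are illustrative assumptions, not part of the original question.

```python
import pandas as pd

# Hypothetical sample data; the last entry is deliberately malformed.
raw = pd.Series(['2016/06/08 17:19:53', '2016/06/08 17:19:54', 'not a date'])

# Parse with an explicit input format; invalid strings become NaT.
parsed = pd.to_datetime(raw, format='%Y/%m/%d %H:%M:%S', errors='coerce')

# Format the parsed values back to strings in the target layout.
formatted = parsed.dt.strftime('%m/%d/%Y %H:%M:%S')
print(formatted)
```

Checking parsed.isna() before formatting is one way to find rows that did not match the expected layout.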

Since you want to convert a column of strings to another column of (different) strings, you could also use the vectorized str.replace method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
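# regex=True is needed on pandas 2.0+, where str.replace treats the pattern as a literal by default
df['TimeStamp'] = df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)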
print(df)

since

In [32]: df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)
Out[32]: 
0    06/08/2016 17:19:53
1    06/08/2016 17:19:54
2    06/08/2016 17:19:54
Name: TimeStamp, dtype: object

This uses a regex to rearrange pieces of the string without first parsing the string as a date. It is faster than the first method (mainly because it skips the parsing step), but it has the disadvantage of not checking that the date strings are valid dates.
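
To illustrate that trade-off, here is a small sketch using a made-up input whose month and day are out of range. The series name bad and its value are assumptions for demonstration only.

```python
import pandas as pd

# Hypothetical input: month 13 and day 45 do not exist.
bad = pd.Series(['2016/13/45 17:19:53'])

# The regex only moves text around, so the invalid date passes through.
rearranged = bad.str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)
print(rearranged[0])  # 13/45/2016 17:19:53

# Parsing first exposes the problem: the invalid entry becomes NaT.
parsed = pd.to_datetime(bad, format='%Y/%m/%d %H:%M:%S', errors='coerce')
print(parsed.isna()[0])  # True
```

If the CSV may contain malformed timestamps, the parsing route is the safer choice despite the extra cost.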
