Convert datetime string to new columns of Day, Month, Year in pandas data frame


Question

I am new to Python and have a pretty simple (hopefully straightforward!) question.

Say I have a data frame with three columns: time (in the format YYYY-MM-DDTHH:MM:SSZ), device_id, and rain. I need to replace the first column, "time", with three columns, "day", "month", and "year", holding the values from the timestamp.

So the original data frame looks something like this:

     time                  device_id                              rain
     2016-12-27T00:00:00Z  9b839362-b06d-4217-96f5-f261c1ada8d6   NaN
     2016-12-28T00:00:00Z  9b839362-b06d-4217-96f5-f261c1ada8d6   0.2
     2016-12-29T00:00:00Z  9b839362-b06d-4217-96f5-f261c1ada8d6   NaN
     2016-12-30T00:00:00Z  9b839362-b06d-4217-96f5-f261c1ada8d6   NaN
     2016-12-31T00:00:00Z  9b839362-b06d-4217-96f5-f261c1ada8d6   NaN

But I'm trying to get the data frame to look like this:

     day  month  year  device_id                              rain
     27   12     2016  9b839362-b06d-4217-96f5-f261c1ada8d6   NaN
     28   12     2016  9b839362-b06d-4217-96f5-f261c1ada8d6   0.2
     29   12     2016  9b839362-b06d-4217-96f5-f261c1ada8d6   NaN
     30   12     2016  9b839362-b06d-4217-96f5-f261c1ada8d6   NaN
     31   12     2016  9b839362-b06d-4217-96f5-f261c1ada8d6   NaN

I don't care about the hours, minutes, or seconds; I only need the day, month, and year from the original timestamp, and I don't even know where to start. Please help!

Here's some reproducible code to get started:

>>> import pandas as pd
>>> df = pd.DataFrame([['2016-12-27T00:00:00Z', '9b839362-b06d-4217-96f5-f261c1ada8d6', 'NaN']], columns=['time', 'device_id', 'rain'])
>>> print(df)
                   time                             device_id  rain
0  2016-12-27T00:00:00Z  9b839362-b06d-4217-96f5-f261c1ada8d6   NaN

Recommended answer

Split the time string on - or T; the first three elements correspond to the year, month, and day. Concatenating them with the other two columns gives what you need (note that the resulting year, month, and day values are strings, not integers):

pd.concat([df.drop('time', axis = 1), 
          (df.time.str.split("-|T").str[:3].apply(pd.Series)
          .rename(columns={0:'year', 1:'month', 2:'day'}))], axis = 1)
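If integer columns in the day, month, year order from the question are wanted, a variation on the same string-based idea is sketched below. It is only an illustration, not part of the original answer: it swaps the split for `str.extract` with named groups, and it assumes every value in `time` starts with YYYY-MM-DD (a non-matching row would yield NaN and make the integer cast fail).

```python
import pandas as pd

df = pd.DataFrame(
    [['2016-12-27T00:00:00Z', '9b839362-b06d-4217-96f5-f261c1ada8d6', 'NaN']],
    columns=['time', 'device_id', 'rain'],
)

# Named groups become the column names of the extracted frame.
# Assumption: every value starts with YYYY-MM-DD, so no row is NaN.
parts = df['time'].str.extract(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
parts = parts.astype(int)

# Put day, month, year first, then the remaining original columns.
result = pd.concat([parts[['day', 'month', 'year']], df.drop(columns='time')], axis=1)
print(result)
```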

An alternative, close to @nlassaux's approach, is to parse the strings with pd.to_datetime and read the components from the .dt accessor:

df['time'] = pd.to_datetime(df['time'])   
df['year'] = df.time.dt.year
df['month'] = df.time.dt.month
df['day'] = df.time.dt.day
df.drop('time', axis=1, inplace=True)
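For reference, here is a self-contained sketch of the datetime-based approach applied to the five sample rows from the question. The column order is arranged with `DataFrame.insert` to match the desired output; that ordering step and the use of `numpy.nan` for the missing rain values are assumptions added for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

device = '9b839362-b06d-4217-96f5-f261c1ada8d6'
df = pd.DataFrame({
    'time': ['2016-12-27T00:00:00Z', '2016-12-28T00:00:00Z',
             '2016-12-29T00:00:00Z', '2016-12-30T00:00:00Z',
             '2016-12-31T00:00:00Z'],
    'device_id': [device] * 5,
    'rain': [np.nan, 0.2, np.nan, np.nan, np.nan],
})

# Parse the ISO 8601 strings into timestamps.
ts = pd.to_datetime(df['time'])

# Drop the original column, then insert the components at the front.
out = df.drop(columns='time')
out.insert(0, 'day', ts.dt.day)
out.insert(1, 'month', ts.dt.month)
out.insert(2, 'year', ts.dt.year)
print(out)
```

Unlike the string-splitting approach, the new columns here are integers, so they can be used directly for grouping or arithmetic.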
