How do I combine columns of my dataframe to create one datetime column which I can use as my index?


Question

I am using Python Pandas for data analysis.

I have a dataframe taken from an Excel file with 6 columns describing the timestamp (year, month, day, hour, minute, second). I want to create a pandas datetime variable, but when I do so using the pd.to_datetime() function, the following happens:

My dataframe (df):

jaar maand  dag uur minuten seconden
2005    7   1   0   0        0
2005    7   1   0   10       0
2005    7   1   0   20       0
2005    7   1   0   30       0
2005    7   1   0   40       0
2005    7   1   0   50       0

What I do:

df['timestamp'] = pd.to_datetime(df['jaar'] + df['maand'] + df['dag'] + df['uur'] + df['minuten'] + df['seconden'])

But then the items of my df['timestamp'] series look like this:

1970-01-01 00:00:00.20050701000000
1970-01-01 00:00:00.20050701001000
1970-01-01 00:00:00.20050701002000

What is the correct way to combine dates and why does this 1970-01-01 thing happen to my datetime? I can't set up my own time range manually because there are missing date points here and there.

I also tried:

I can combine them to get the timestamp of one row, but I have so much data that I just can't use loops to do this.

date00 = pd.datetime(df.iloc[0, 0], df.iloc[0, 1], df.iloc[0, 2], df.iloc[0, 3], df.iloc[0, 4], df.iloc[0, 5])

This is my first time posting here. I hope the editing is okay.

Recommended answer

It looks like you have int dtypes, so one method would be to construct the datetime using apply with all your columns as the params:

In [381]:
import pandas as pd
import datetime as dt
df.apply(lambda x: dt.datetime(x['jaar'], x['maand'], x['dag'], x['uur'], x['minuten'], x['seconden']), axis=1)

Out[381]:
0   2005-07-01 00:00:00
1   2005-07-01 00:10:00
2   2005-07-01 00:20:00
3   2005-07-01 00:30:00
4   2005-07-01 00:40:00
5   2005-07-01 00:50:00
dtype: datetime64[ns]
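
Because `apply` with `axis=1` calls a Python function once per row, it may be slow on a large frame. A possible vectorized sketch follows, assuming a pandas version in which `pd.to_datetime` accepts a DataFrame of component columns named `year`, `month`, `day`, `hour`, `minute` and `second`. The `mapping` dict, the `timestamps` name and the small sample frame are illustrative and not part of the original answer:

```python
import pandas as pd

# Sample frame mirroring the question's data (Dutch column names).
df = pd.DataFrame({
    'jaar': [2005, 2005, 2005],
    'maand': [7, 7, 7],
    'dag': [1, 1, 1],
    'uur': [0, 0, 0],
    'minuten': [0, 10, 20],
    'seconden': [0, 0, 0],
})

# Map the Dutch names to the component names pd.to_datetime recognizes.
mapping = {'jaar': 'year', 'maand': 'month', 'dag': 'day',
           'uur': 'hour', 'minuten': 'minute', 'seconden': 'second'}

# Assemble all rows at once, without a Python-level loop over rows.
timestamps = pd.to_datetime(df[list(mapping)].rename(columns=mapping))
```

The result should be the same kind of datetime series as the `apply` version, so it can be assigned to a column or to `df.index` in the same way.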

You can set this as the index by overwriting it directly:

In [382]:
df.index = df.apply(lambda x: dt.datetime(x['jaar'], x['maand'], x['dag'], x['uur'], x['minuten'], x['seconden']), axis=1)
df

Out[382]:
                     jaar  maand  dag  uur  minuten  seconden
2005-07-01 00:00:00  2005      7    1    0        0         0
2005-07-01 00:10:00  2005      7    1    0       10         0
2005-07-01 00:20:00  2005      7    1    0       20         0
2005-07-01 00:30:00  2005      7    1    0       30         0
2005-07-01 00:40:00  2005      7    1    0       40         0
2005-07-01 00:50:00  2005      7    1    0       50         0
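
As for the 1970-01-01 part of the question: when `pd.to_datetime` receives integers, it treats them as offsets from the Unix epoch (1970-01-01), in nanoseconds by default, so a number such as 20050701000000 lands shortly after the epoch instead of in 2005. A small sketch of that behaviour, plus one way around it by parsing the digits as a string with an explicit format. The sample value is taken from the question's output, and the variable names are illustrative:

```python
import pandas as pd

raw = pd.Series([20050701000000])

# Integers are read as nanoseconds since 1970-01-01 by default.
as_epoch = pd.to_datetime(raw)

# Converting to text and giving an explicit format parses the digits as a date.
as_date = pd.to_datetime(raw.astype(str), format='%Y%m%d%H%M%S')
```

Here `as_epoch` falls on 1970-01-01, while `as_date` holds the intended 2005-07-01 timestamp.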
