Pandas: create timestamp from 3 columns: Month, Day, Hour

Views: 143 | Published: 2020/5/24 1:50:00 | Tags: python, datetime, pandas


Problem description

I'm using Python 2.7, pandas 0.14.1-2, numpy 1.8.1-1. I have to use Python 2.7 because I'm coupling it with something that doesn't work on Python 3.

I'm trying to analyze a CSV file that stores Month, Day and Hour in separate columns, and looks something like the following:

Month  Day  Hour  Value
1      1    1     105
1      1    2     30
1      1    3     85
1      1    4     52
1      1    5     65

I basically want to create a timestamp from those columns, use "2005" as the year, and set this new timestamp column as the index. I've read a lot of similar questions (here and here), but they all rely on doing the conversion during read_csv(). I don't have a year column, so I don't think this applies to me (aside from loading the dataframe, inserting the column, writing it out, and redoing read_csv, which seems convoluted).

After loading the dataframe, I insert a Year column in position 0:

df.insert(0, "Year", 2005)

So I now have:

Year  Month  Day  Hour  Value
2005  1      1    1     105
2005  1      1    2     30
2005  1      1    3     85
2005  1      1    4     52
2005  1      1    5     65

df.dtypes tells me that all columns are int64 types.

Then I tried this:

df['Datetime'] = pd.to_datetime(df.Year*1000000 + df.Month*10000 + df.Day+100 + df.Hour, format="%Y%M%d%H")

But I'm getting "TypeError: 'long' object is unsliceable".
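One thing worth checking in the expression above is the digit layout of the integer key: to pack the day into a YYYYMMDDHH number it has to be multiplied by 100, whereas the attempt adds 100. The sketch below shows the intended layout using only the standard library. `encode_ymdh` is a hypothetical helper name, and nothing here claims that this layout issue is what triggers the TypeError.

```python
from datetime import datetime


def encode_ymdh(year, month, day, hour):
    """Pack the components into a YYYYMMDDHH integer (hypothetical helper)."""
    return year * 1000000 + month * 10000 + day * 100 + hour


key = encode_ymdh(2005, 12, 31, 5)

# Parse the key as a string; %m is the month directive and %H the 24-hour clock.
parsed = datetime.strptime(str(key), "%Y%m%d%H")

print(key)
print(parsed)
```

Note that `%H` only accepts hours 00 to 23, so this assumes the Hour column stays in that range.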

On the other hand, the following runs without errors:

df['Datetime'] = pd.to_datetime(df.Year*10000 + df.Month*100 + df.Day, format="%Y%M%d")

Since Python 2.7 doesn't like %Y%M%d%H, as pointed out by @EdChum, I've tried doing it in two steps: creating a datetime without hours, then adding the hours afterwards. But the output is not what I wanted:

In [1]: # Do it without hours first (otherwise doesn't work in Python 2.7)
df['Datetime'] = pd.to_datetime(df.Year*10000 + df.Month*100 + df.Day, format="%Y%M%d")

In [2]: df['Datetime']
Out [2]:
0    2005-01-01 00:01:00
1    2005-01-01 00:01:00
...
13   2005-01-01 00:01:00
14   2005-01-01 00:01:00
...
8745   2005-01-31 00:12:00
8746   2005-01-31 00:12:00
...
8758   2005-01-31 00:12:00
8759   2005-01-31 00:12:00

For example, row 8758 should be December 31, 2005. What is wrong here?
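The output above is consistent with how strptime-style format codes work: uppercase `%M` means minutes, while lowercase `%m` means month. With `%Y%M%d`, the two digits after the year land in the minute field and the month falls back to January. A small standard-library sketch of the difference (the sample string is an assumption chosen to match the last rows shown):

```python
from datetime import datetime

sample = "20051231"

# %M is the minute directive: "12" becomes 12 minutes, month defaults to 1.
with_minutes = datetime.strptime(sample, "%Y%M%d")

# %m is the month directive: "12" becomes December.
with_month = datetime.strptime(sample, "%Y%m%d")

print(with_minutes)
print(with_month)
```

The first result reproduces the `2005-01-31 00:12:00` pattern seen in the output, and the second gives the expected date.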

Once I resolve that, I'll be able to re-add the hours:

In [3]: # Then add the hours
df['Datetime'] = df['Datetime'] + pd.to_timedelta(df['Hour'], unit="h")
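Putting the two steps together, a minimal sketch of the whole approach might look like the following. It assumes a small made-up dataframe in place of the CSV, uses `%m` for the month, and converts the integer key to strings before parsing, which sidesteps any question of how a given pandas version handles integer input with a format. It was written against a recent pandas and has not been checked on 0.14.1.

```python
import pandas as pd

# Made-up sample standing in for the CSV contents.
df = pd.DataFrame({
    "Month": [1, 1, 12],
    "Day": [1, 1, 31],
    "Hour": [1, 2, 5],
    "Value": [105, 30, 65],
})
df.insert(0, "Year", 2005)

# Step 1: build a YYYYMMDD key and parse it as a date (no hours yet).
date_key = (df["Year"] * 10000 + df["Month"] * 100 + df["Day"]).astype(str)
df["Datetime"] = pd.to_datetime(date_key, format="%Y%m%d")

# Step 2: add the hours as a timedelta.
df["Datetime"] = df["Datetime"] + pd.to_timedelta(df["Hour"], unit="h")

# Use the new column as the index.
df = df.set_index("Datetime")
print(df)
```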

Output

                     Value
Month_Day_Hour            
2005-01-01 01:00:00    105
2005-01-01 02:00:00     30
2005-01-01 03:00:00     85
2005-01-01 04:00:00     52
2005-01-01 05:00:00     65
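The code that produced this output is not shown here. As one possible way to arrive at a frame of the same shape, newer pandas releases (not the 0.14.1 mentioned in the question) let `pd.to_datetime` assemble timestamps from a dataframe of component columns named year, month, day and hour. The sketch below assumes such a version; the index name is copied from the output above and the intermediate names `parts` and `stamps` are illustrative.

```python
import pandas as pd

# Sample rows from the question.
df = pd.DataFrame({
    "Month": [1, 1, 1, 1, 1],
    "Day": [1, 1, 1, 1, 1],
    "Hour": [1, 2, 3, 4, 5],
    "Value": [105, 30, 85, 52, 65],
})

# Component columns with the names to_datetime expects; the year is a constant.
parts = pd.DataFrame({
    "year": 2005,
    "month": df["Month"],
    "day": df["Day"],
    "hour": df["Hour"],
})
stamps = pd.to_datetime(parts)

# Keep only Value and index it by the assembled timestamps.
result = df[["Value"]].copy()
result.index = pd.DatetimeIndex(stamps, name="Month_Day_Hour")
print(result)
```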
