Convert pandas multi-index to pandas timestamp

Question

I'm trying to convert an unstacked, multi-indexed data frame back to a single pandas datetime index.

The index of my original data frame, i.e. before multi-indexing and unstacking, looks like this:

In [1]: df1_season.index
Out [1]: 

<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-01 02:00:00, ..., 2014-07-31 23:00:00]
Length: 1472, Freq: None, Timezone: None

Then I apply the multi-indexing and unstacking, so that I can plot the yearly data on top of each other, like this:

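# Group by (year, month, day, hour) and keep the last observation in each group
df_sort = df1_season.groupby(lambda x: (x.year, x.month, x.day, x.hour)).agg(lambda s: s.iloc[-1])
df_sort.index = pd.MultiIndex.from_tuples(df_sort.index, names=['Y','M','D','H'])
# Move the year level into the columns so that each year becomes its own column
unstacked = df_sort.unstack('Y')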

My new data frame for the first two days of May looks like this:

In [2]: unstacked
Out [2]:

          temp        season        
Y        2013  2014    2013    2014
M D  H                             
5 1  2   24.2  22.3  Summer  Summer
     8   24.1  22.3  Summer  Summer
     14  24.3  23.2  Summer  Summer
     20  24.6  23.2  Summer  Summer
  2  2   24.2  22.5  Summer  Summer
     8   24.8  22.2  Summer  Summer
     14  24.9  22.4  Summer  Summer
     20  24.9  22.8  Summer  Summer

736 rows × 4 columns 

The index of the new data frame shown above now looks like this:

In [2]: unstacked.index.values[0:8]
Out [2]:

array([(5, 1, 2), (5, 1, 8), (5, 1, 14), (5, 1, 20), (5, 2, 2), (5, 2, 8), (5, 2, 14), 
       (5, 2, 20)], dtype=object)

This doesn't produce a very nice plot with respect to the xticks (major and minor). If I can convert this multi-index back to a single pandas datetime index, using only the month, day and hour data, then I think the major/minor ticks will be plotted automatically the way I would like. For example:

Current solution:

xticks = (5, 1, 2), (5, 1, 8) … (5, 2, 20)

Required solution:

xticks(major) = Day, Month (displayed as MAY 01, MAY 02 etc etc)
xticks(minor) = Hour (displayed as 02h 08h … 20h)
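
For illustration, one possible way to build such a datetime index from the (M, D, H) levels is sketched below. It is not taken from the original post: the small frame is a hypothetical stand-in for `unstacked`, the column name `temp_2013` is made up, and the placeholder year 2000 is an assumption (chosen because it is a leap year, so February 29 stays valid).

```python
import pandas as pd

# Hypothetical stand-in for `unstacked`: a frame indexed by (M, D, H).
idx = pd.MultiIndex.from_tuples(
    [(5, 1, 2), (5, 1, 8), (5, 1, 14), (5, 1, 20),
     (5, 2, 2), (5, 2, 8), (5, 2, 14), (5, 2, 20)],
    names=["M", "D", "H"],
)
unstacked = pd.DataFrame(
    {"temp_2013": [24.2, 24.1, 24.3, 24.6, 24.2, 24.8, 24.9, 24.9]},
    index=idx,
)

# Assumption: a placeholder year, since the index no longer carries one.
DUMMY_YEAR = 2000

# Turn the index levels into columns named the way pd.to_datetime expects.
parts = (
    unstacked.index.to_frame(index=False)
    .rename(columns={"M": "month", "D": "day", "H": "hour"})
    .assign(year=DUMMY_YEAR)
)

# Work on a copy so the multi-indexed frame stays available.
flat = unstacked.copy()
flat.index = pd.DatetimeIndex(pd.to_datetime(parts))

print(flat.index[:2])
```

With a datetime index in place, the year shown on the axis is only the placeholder, so any tick labels would need a format that omits it.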

Recommended answer

Converting data back and forth in pandas gets messy very fast, as you seem to have experienced. My general recommendation concerning pandas and indexing is to never just set the index, but to copy it first. Make sure you have a column that contains the index, since pandas does not allow all operations on the index, and repeatedly setting and resetting the index can cause columns to disappear.

TL;DR: Don't convert the index back. Keep a copy.
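
A minimal sketch of what keeping a copy could look like, assuming a small hypothetical frame in place of `df1_season` and a made-up column name `timestamp`: the datetime index is copied into an ordinary column before the multi-index is set, so it can be restored later without any reconstruction.

```python
import pandas as pd

# Hypothetical stand-in for `df1_season`.
index = pd.to_datetime(
    ["2013-05-01 02:00", "2013-05-01 08:00",
     "2014-05-01 02:00", "2014-05-01 08:00"]
)
df = pd.DataFrame({"temp": [24.2, 24.1, 22.3, 22.3]}, index=index)

# Copy the index into a regular column before touching the index itself.
df["timestamp"] = df.index

# Replace the index with (Y, M, D, H) levels; "timestamp" stays as a column.
keyed = df.set_index(
    [df.index.year, df.index.month, df.index.day, df.index.hour]
)
keyed.index.names = ["Y", "M", "D", "H"]

# Later, the original datetime index can be restored from the saved column.
restored = keyed.set_index("timestamp")

print(restored.index)
```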
