Pandas - convert dataframe multi-index to datetime object
Question
Consider the input file b.dat:
string,date,number
a string,2/5/11 9:16am,1.0
a string,3/5/11 10:44pm,2.0
a string,4/22/11 12:07pm,3.0
a string,4/22/11 12:10pm,4.0
a string,4/29/11 11:59am,1.0
a string,5/2/11 1:41pm,2.0
a string,5/2/11 2:02pm,3.0
a string,5/2/11 2:56pm,4.0
a string,5/2/11 3:00pm,5.0
a string,5/2/14 3:02pm,6.0
a string,5/2/14 3:18pm,7.0
I can group monthly totals like so:
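import pandas as pd

b = pd.read_csv('b.dat')
b['date'] = pd.to_datetime(b['date'], format='%m/%d/%y %I:%M%p')
b.index = b['date']
# pd.groupby() was removed from pandas; call groupby on the DataFrame instead
bg = b.groupby([b.index.year, b.index.month])
bgs = bg[['number']].sum()  # sum only the numeric column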
The grouped totals and their index look like this:
bgs
        number
2011 2       1
     3       2
     4       8
     5      14
2014 5      13
bgs.index
MultiIndex(levels=[[2011, 2014], [2, 3, 4, 5]],
labels=[[0, 0, 0, 0, 1], [0, 1, 2, 3, 3]])
I'd like to reformat the index into datetime format (the day can be the first of the month).
I tried the following:
bgs.index = pd.to_datetime(bgs.index)
and
bgs.index = pd.DatetimeIndex(bgs.index)
Both fail. Does anyone know how I can do this?
Recommended answer
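Consider resampling by month ('M', spelled 'ME' in pandas 2.2 and later) rather than grouping by attributes of the DatetimeIndex. The how= argument has since been removed from resample, so the aggregation is called as a method:
In [11]: b.resample('M')[['number']].sum(min_count=1).dropna()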
Out[11]:
            number
date
2011-02-28       1
2011-03-31       2
2011-04-30       8
2011-05-31      14
2014-05-31      13
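The result above is labelled with month-end dates, while the question asked for the first of the month. A minimal sketch of one way to get that, assuming the month-start alias 'MS' is acceptable and using an in-memory copy of b.dat so the example runs on its own (the names csv_text and monthly are illustrative, not from the original post):

```python
import io

import pandas as pd

# In-memory copy of b.dat from the question, so the sketch is self-contained.
csv_text = """string,date,number
a string,2/5/11 9:16am,1.0
a string,3/5/11 10:44pm,2.0
a string,4/22/11 12:07pm,3.0
a string,4/22/11 12:10pm,4.0
a string,4/29/11 11:59am,1.0
a string,5/2/11 1:41pm,2.0
a string,5/2/11 2:02pm,3.0
a string,5/2/11 2:56pm,4.0
a string,5/2/11 3:00pm,5.0
a string,5/2/14 3:02pm,6.0
a string,5/2/14 3:18pm,7.0
"""

b = pd.read_csv(io.StringIO(csv_text))
b['date'] = pd.to_datetime(b['date'], format='%m/%d/%y %I:%M%p')
b = b.set_index('date')

# 'MS' labels each bin with the first day of the month.
# min_count=1 leaves months without data as NaN so dropna() can remove them.
monthly = b.resample('MS')[['number']].sum(min_count=1).dropna()
print(monthly)
```

The index of monthly is a DatetimeIndex, so no conversion from a MultiIndex is needed afterwards.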
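Note: you have to drop the NaN rows if you don't want the empty months in between. In current pandas, sum() returns 0 for empty months unless min_count=1 is passed, so NaN only appears with that argument.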
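If the grouped frame with its (year, month) MultiIndex already exists, the index can also be converted directly, which is what the question literally asked for. The following is a sketch under the assumption that level 0 holds the year and level 1 the month; bgs is rebuilt by hand from the totals shown in the question, and parts is an illustrative helper name:

```python
import pandas as pd

# Rebuild the grouped result from the question by hand (values copied from its output).
idx = pd.MultiIndex.from_tuples(
    [(2011, 2), (2011, 3), (2011, 4), (2011, 5), (2014, 5)]
)
bgs = pd.DataFrame({'number': [1.0, 2.0, 8.0, 14.0, 13.0]}, index=idx)

# pd.to_datetime can assemble dates from a DataFrame that has
# 'year', 'month' and 'day' columns; the day is fixed to 1 here.
parts = pd.DataFrame({
    'year': bgs.index.get_level_values(0),
    'month': bgs.index.get_level_values(1),
    'day': 1,
})
bgs.index = pd.DatetimeIndex(pd.to_datetime(parts))
print(bgs)
```

Passing the MultiIndex straight to pd.to_datetime fails because it is a sequence of (year, month) tuples rather than something date-like, so the levels are pulled apart first.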