pandas: convert datetime to end-of-month

Question

I have written a function to convert pandas datetime dates to month-end:

import datetime

import numpy
import pandas
from pandas.tseries.offsets import Day, MonthEnd

def get_month_end(d):
    # Step back one day before rolling forward: 31/March + MonthEnd() returns 30/April
    month_end = d - Day() + MonthEnd()
    if month_end.month == d.month:
        return month_end
    # Raise an explicit exception; a bare `raise` has nothing to re-raise here
    raise ValueError(
        "Something went wrong while converting dates to EOM: "
        "{} was converted to {}".format(d, month_end)
    )

This function seems to be quite slow. Is there a faster alternative? I noticed the slowdown because I am running it on a dataframe column with 50,000 dates, and the code is much slower than it was before I added the month-end conversion.

df = pandas.read_csv(inpath, na_values = nas, converters = {open_date: read_as_date})
df[open_date] = df[open_date].apply(get_month_end)

I am not sure whether this is relevant, but I am reading the dates in as follows:

def read_as_date(x):
    return datetime.datetime.strptime(x, fmt)

Recommended answer

Revised: converting to a monthly period and then back to a timestamp does the trick.

In [104]: df = DataFrame(dict(date = [Timestamp('20130101'),Timestamp('20130131'),Timestamp('20130331'),Timestamp('20130330')],value=randn(4))).set_index('date')

In [105]: df
Out[105]: 
               value
date                
2013-01-01 -0.346980
2013-01-31  1.954909
2013-03-31 -0.505037
2013-03-30  2.545073

In [106]: df.index = df.index.to_period('M').to_timestamp('M')

In [107]: df
Out[107]: 
               value
2013-01-31 -0.346980
2013-01-31  1.954909
2013-03-31 -0.505037
2013-03-31  2.545073
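
The question works on a dataframe column rather than an index, so here is a minimal sketch of the same period round-trip applied to a column through the .dt accessor. The frame and the column names open_date and open_date_eom are made up for illustration. It uses how="end" followed by normalize() instead of to_timestamp('M'), which is a substitution on my part and not the exact call shown above.

```python
import pandas as pd

# Hypothetical data standing in for the asker's frame; the column names are assumptions.
df = pd.DataFrame(
    {"open_date": pd.to_datetime(["2013-01-01", "2013-01-31", "2013-03-31", "2013-03-30"])}
)

# Datetime -> monthly period -> timestamp at the end of that period.
# how="end" gives the last instant of the month, so normalize() resets
# the time of day to midnight and leaves a plain month-end date.
df["open_date_eom"] = (
    df["open_date"].dt.to_period("M").dt.to_timestamp(how="end").dt.normalize()
)

print(df)
```

The whole column is converted in one vectorized call, so no Python function runs per row as it does with apply.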

Note that the same conversion can also be done by adding a MonthEnd(0) offset, although the period round-trip above would be slightly faster.

In [85]: df.index + pd.offsets.MonthEnd(0) 
Out[85]: DatetimeIndex(['2013-01-31', '2013-01-31', '2013-03-31', '2013-03-31'], dtype='datetime64[ns]', name=u'date', freq=None, tz=None)
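
The zero in MonthEnd(0) matters, and it is what makes the "subtract a day first" workaround from the question unnecessary. A small sketch with made-up dates, assuming a Series of datetimes rather than an index:

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

# Hypothetical sample dates; two of them already fall on a month end.
dates = pd.Series(
    pd.to_datetime(["2013-01-01", "2013-01-31", "2013-03-31", "2013-03-30"])
)

# MonthEnd(0) rolls forward to the month end, but leaves dates that are
# already on a month end unchanged.
rolled = dates + MonthEnd(0)

# MonthEnd(1) always moves forward, so a date already on a month end
# jumps to the end of the following month (31 March becomes 30 April).
shifted = dates + MonthEnd(1)

print(pd.DataFrame({"date": dates, "MonthEnd(0)": rolled, "MonthEnd(1)": shifted}))
```

Adding the offset to the whole Series is vectorized, so it can replace df[open_date].apply(get_month_end) directly.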
