Add months to a datetime column in pandas

Question

I have a dataframe df with two columns, as shown below:

               START_DATE             MONTHS
0              2015-03-21                240
1              2015-03-21                240
2              2015-03-21                240
3              2015-03-21                240
4              2015-03-21                240
5              2015-01-01                120
6              2017-01-01                240
7                     NaN                NaN
8                     NaN                NaN
9                     NaN                NaN

Both columns have dtype object.

>>> df.dtypes
START_DATE    object
MONTHS        object
dtype: object

Now I want to create a new column "Result" by adding df['MONTHS'] months to df['START_DATE']. This is what I tried:

import pandas as pd
from dateutil.relativedelta import relativedelta

df['START_DATE'] = pd.to_datetime(df['START_DATE'])
df['MONTHS'] = df['MONTHS'].astype(float)

df['offset'] = df['MONTHS'].apply(lambda x: relativedelta(months=x))

df['Result'] = df['START_DATE'] + df['offset'] 

This raises the following error:

TypeError: incompatible type [object] for a datetime/timedelta operation

Note: I wanted to convert df['MONTHS'] to int, but that doesn't work because the column contains nulls.

Could you please point me in the right direction? Thanks.

Recommended answer

This is a vectorized way to do it, so it should be quite performant. Note that it doesn't handle month crossings / endings, and it doesn't deal well with DST changes. The times of day in the result are consistent with each month being converted to a fixed average length (one twelfth of a 365.2425-day year) rather than a calendar month.

In [32]: df['START_DATE'] + df['MONTHS'].values.astype("timedelta64[M]")
Out[32]: 
0   2035-03-20 20:24:00
1   2035-03-20 20:24:00
2   2035-03-20 20:24:00
3   2035-03-20 20:24:00
4   2035-03-20 20:24:00
5   2024-12-31 10:12:00
6   2036-12-31 20:24:00
7                   NaT
8                   NaT
9                   NaT
Name: START_DATE, dtype: datetime64[ns]
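
The odd times of day above can be reproduced by hand. The following is only a sketch of the arithmetic, assuming a month is treated as one twelfth of a mean Gregorian year (365.2425 days, i.e. 31,556,952 seconds); the name AVG_MONTH is made up for this illustration and is not a pandas or NumPy constant.

```python
import pandas as pd

# Assumption: one "month" is 1/12 of a mean Gregorian year.
# 365.2425 days * 86400 s = 31_556_952 s, which divides evenly by 12.
AVG_MONTH = pd.Timedelta(seconds=31_556_952 // 12)

# 240 average months is 7304.85 days, slightly less than the
# 7305 days in the 20 calendar years starting 2015-03-21.
print(pd.Timestamp("2015-03-21") + 240 * AVG_MONTH)  # 2035-03-20 20:24:00
print(pd.Timestamp("2015-01-01") + 120 * AVG_MONTH)  # 2024-12-31 10:12:00
```

Both values match rows 0 and 5 of the output, which is why this approach lands a few hours short of the expected calendar date.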

If you need exact MonthEnd/MonthBegin handling, this is an appropriate method. Rows containing nulls are dropped first, so they are missing from the result. (Use pd.DateOffset(months=...) instead to land on the same day of the month.)

In [33]: df.dropna().apply(lambda x: x['START_DATE'] + pd.offsets.MonthEnd(x['MONTHS']), axis=1)
Out[33]: 
0   2035-02-28
1   2035-02-28
2   2035-02-28
3   2035-02-28
4   2035-02-28
5   2024-12-31
6   2036-12-31
dtype: datetime64[ns]
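
For same-day results that also keep the null rows, something like the following may work. It is a minimal sketch, not a tuned implementation: the small frame is an illustrative subset of the question's data, MONTHS is assumed to already be float (as after the astype(float) step in the question), and the loop is row-by-row rather than vectorized.

```python
import pandas as pd

# Illustrative subset of the question's data, with one all-null row.
df = pd.DataFrame({
    "START_DATE": pd.to_datetime(["2015-03-21", "2015-01-01", "2017-01-01", None]),
    "MONTHS": [240.0, 120.0, 240.0, float("nan")],
})

# Add calendar months row by row; rows with a missing date or
# month count become NaT instead of being dropped.
result = [
    start + pd.DateOffset(months=int(months))
    if pd.notna(start) and pd.notna(months)
    else pd.NaT
    for start, months in zip(df["START_DATE"], df["MONTHS"])
]
df["Result"] = pd.Series(result, index=df.index, dtype="datetime64[ns]")

print(df)  # e.g. 2015-03-21 + 240 months -> 2035-03-21
```

pd.DateOffset(months=n) moves the date by whole calendar months and keeps the day of the month where possible, so no stray times of day appear.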
