Pandas date_range - subtracting numpy timedelta gives odd result, time becomes not 0:00:00
Problem description
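I am trying to generate a set of dates with the pandas date_range function. Then I want to iterate over this range and subtract several months from each date (the exact number of months is determined in the loop) to get a new date.
I get some very odd results when I do this.
MVP: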
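import numpy as np
import pandas as pd
test_size = 3  # step in months (3 matches the freq='3MS' shown below)
#get date range (pandas 1.4+ spells closed='left' as inclusive='left')
dates = pd.date_range(start = '1/1/2013', end='1/1/2018', freq=str(test_size)+'MS', closed='left', normalize=True)
#take first date as example
date = dates[0]
date
Timestamp('2013-01-01 00:00:00', freq='3MS')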
So far so good.
Now let's say I want to go just one month back from this date. I define a numpy timedelta (it supports months as a unit, while pandas' Timedelta doesn't):
#get timedelta of 1 month
deltaGap = np.timedelta64(1,'M')
#subtract one month from date
date - deltaGap
Timestamp('2012-12-01 13:30:54', freq='3MS')
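Why is that? Why do I get 13:30:54 in the time component instead of midnight?
Moreover, if I subtract more than 1 month, the shift becomes so large that the date itself is off by a whole day: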
#let's say I want to subtract both 2 years and then 1 month
deltaTrain = np.timedelta64(2,'Y')
#subtract 2 years and then subtract 1 month
date - deltaTrain - deltaGap
Timestamp('2010-12-02 01:52:30', freq='3MS')
I've had similar issues with timedelta, and the solution I ended up using was relativedelta from dateutil, which is built specifically for this kind of application (taking into account all the calendar weirdness like leap years, weekdays, etc.). For example, given:
from dateutil.relativedelta import relativedelta
date = dates[0]
>>> date
Timestamp('2013-01-01 00:00:00', freq='10MS')
deltaGap = relativedelta(months=1)
>>> date-deltaGap
Timestamp('2012-12-01 00:00:00', freq='10MS')
deltaGap = relativedelta(years=2, months=1)
>>> date-deltaGap
Timestamp('2010-12-01 00:00:00', freq='10MS')
Check out the documentation for more info on relativedelta.
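To make the idea of calendar-aware subtraction concrete, here is a minimal standard-library sketch. It is not how dateutil implements relativedelta; the helper name subtract_months is hypothetical and only illustrates the principle of stepping back whole calendar months and clamping the day when the target month is shorter.

```python
import calendar
from datetime import datetime


def subtract_months(moment: datetime, months: int) -> datetime:
    """Hypothetical helper: step back whole calendar months, keeping the time of day."""
    # Count months from year 0 so that crossing a year boundary is plain integer math.
    total = moment.year * 12 + (moment.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day so that 31 March minus one month lands on the last day of February.
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


start = datetime(2013, 1, 1)
one_month_back = subtract_months(start, 1)              # 2012-12-01 00:00:00
two_years_one_month_back = subtract_months(start, 25)   # 2010-12-01 00:00:00
clamped = subtract_months(datetime(2013, 3, 31), 1)     # 2013-02-28 00:00:00
```

Because the shift is counted in calendar months rather than in a fixed number of seconds, the time of day stays at midnight, which is the behaviour the question was expecting.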
The issues with numpy.timedelta64
I think that the problem with np.timedelta64 is revealed in these 2 parts of the docs:
"There are two Timedelta units ('Y', years and 'M', months) which are treated specially, because how much time they represent changes depending on when they are used. While a timedelta day unit is equivalent to 24 hours, there is no way to convert a month unit into days, because different months have different numbers of days."
and
"The length of the span is the range of a 64-bit integer times the length of the date or unit. For example, the time span for 'W' (week) is exactly 7 times longer than the time span for 'D' (day), and the time span for 'D' (day) is exactly 24 times longer than the time span for 'h' (hour)."
So the timedeltas are fine for hours, days and weeks, because these are fixed-length timespans. However, months and years are variable in length (think leap years), and so to take this into account, numpy takes some sort of "average" (I guess). One numpy "year" seems to be 365 days, 5 hours, 49 minutes and 12 seconds, while one numpy "month" seems to be 30 days, 10 hours, 29 minutes and 6 seconds.
# Adding one numpy month adds 30 days + 10:29:06:
deltaGap = np.timedelta64(1,'M')
date+deltaGap
# Timestamp('2013-01-31 10:29:06', freq='10MS')
# Adding one numpy year adds 365 days + 05:49:12:
deltaGap = np.timedelta64(1,'Y')
date+deltaGap
# Timestamp('2014-01-01 05:49:12', freq='10MS')
This is not so easy to work with, which is why I would just go to relativedelta, which is much more intuitive (to me).
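The averaged durations quoted above can be checked with plain arithmetic. The sketch below assumes the averages come from the Gregorian cycle of 146,097 days per 400 years (365.2425 days per year); that origin is an assumption, but it reproduces both the quoted durations and the timestamps from the question. It uses only the standard library, so it does not depend on how a particular pandas version handles the 'M' and 'Y' units.

```python
from datetime import datetime, timedelta

SECONDS_PER_DAY = 86_400
# Assumption: the averages are taken over the 400-year Gregorian cycle.
DAYS_PER_400_YEARS = 400 * 365 + 97  # 146,097 days

avg_year = timedelta(seconds=DAYS_PER_400_YEARS * SECONDS_PER_DAY // 400)
avg_month = timedelta(seconds=DAYS_PER_400_YEARS * SECONDS_PER_DAY // (400 * 12))

print(avg_year)   # 365 days, 5:49:12
print(avg_month)  # 30 days, 10:29:06

start = datetime(2013, 1, 1)
print(start - avg_month)                 # 2012-12-01 13:30:54
print(start - 2 * avg_year - avg_month)  # 2010-12-02 01:52:30
```

The last two lines match the odd results in the question: the subtraction removes a fixed number of seconds, not a calendar month, so the result drifts away from midnight and, with two years added in, onto the next day.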