Getting date ranges for multiple datetime pairs
Question
Given a datetime array of shape (n, 2):
x = np.array([['2017-10-02T00:00:00.000000000', '2017-10-12T00:00:00.000000000']], dtype='datetime64[ns]')
Here x has shape (1, 2), but in reality it could be (n, 2) with n >= 1. In each pair, the first date is always smaller than (or equal to) the second. I want to get a list of all date ranges between each pair of dates in x. This is basically what I'm doing:
np.concatenate([pd.date_range(*y, closed='right') for y in x])
It works, giving:
array(['2017-10-03T00:00:00.000000000', '2017-10-04T00:00:00.000000000',
'2017-10-05T00:00:00.000000000', '2017-10-06T00:00:00.000000000',
'2017-10-07T00:00:00.000000000', '2017-10-08T00:00:00.000000000',
'2017-10-09T00:00:00.000000000', '2017-10-10T00:00:00.000000000',
'2017-10-11T00:00:00.000000000', '2017-10-12T00:00:00.000000000'], dtype='datetime64[ns]')
But this is pretty slow because of the list comprehension; it isn't vectorised the way I'd like. Is there a better way to obtain date ranges for multiple pairs of dates?
I'll provide as much clarification as needed. Thanks.
Recommended answer
It's a tad convoluted... but:
d = np.array(1, dtype='timedelta64[D]')
x = x.astype('datetime64[D]')
deltas = np.diff(x, axis=1) / d
np.concatenate([
i + np.arange(j + 1) for i, j in zip(x[:, 0], deltas[:, 0].astype(int))
]).astype('datetime64[ns]')
array(['2017-10-02T00:00:00.000000000', '2017-10-03T00:00:00.000000000',
'2017-10-04T00:00:00.000000000', '2017-10-05T00:00:00.000000000',
'2017-10-06T00:00:00.000000000', '2017-10-07T00:00:00.000000000',
'2017-10-08T00:00:00.000000000', '2017-10-09T00:00:00.000000000',
'2017-10-10T00:00:00.000000000', '2017-10-11T00:00:00.000000000',
'2017-10-12T00:00:00.000000000'], dtype='datetime64[ns]')
How it works
- d represents one day.
- x is turned into dates with no timestamps.
- diff gets me the number of days difference, but in timedelta space.
- I divide by my d, which is also in timedelta space, so the dimensions disappear, leaving me with a float, which I cast to int.
- When I add the first column of the pairs, x[:, 0], to an array of integers, I get a broadcasting of adding 1 unit of whatever the dimension of x is, which is datetime64[D]. So I'm adding one day.
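The steps above can be traced with the intermediate values made explicit. This is only a sketch that reuses the single pair from the question; the names x_days, n_days and first_range are introduced here for illustration and are not part of the answer.

```python
import numpy as np

# Same single-pair input as in the question
x = np.array([['2017-10-02T00:00:00.000000000',
               '2017-10-12T00:00:00.000000000']], dtype='datetime64[ns]')

d = np.array(1, dtype='timedelta64[D]')   # one day
x_days = x.astype('datetime64[D]')        # drop the time-of-day part
deltas = np.diff(x_days, axis=1) / d      # timedelta / timedelta gives plain floats
n_days = deltas[:, 0].astype(int)         # whole days between start and end

# datetime64[D] + integer array: each integer is treated as a number of days
first_range = x_days[0, 0] + np.arange(n_days[0] + 1)

print(deltas.dtype, n_days, first_range.dtype)
```

Because the division cancels the timedelta unit, deltas is an ordinary float array, and the final addition produces datetime64[D] values running from the start date to the end date inclusive.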
Derived from / inspired by @hpaulj (I will remove this if they post an answer):
d = np.array(1, dtype='timedelta64[D]')
np.concatenate([np.arange(row[0], row[1] + 1, d) for row in x])
array(['2017-10-02T00:00:00.000000000', '2017-10-03T00:00:00.000000000',
'2017-10-04T00:00:00.000000000', '2017-10-05T00:00:00.000000000',
'2017-10-06T00:00:00.000000000', '2017-10-07T00:00:00.000000000',
'2017-10-08T00:00:00.000000000', '2017-10-09T00:00:00.000000000',
'2017-10-10T00:00:00.000000000', '2017-10-11T00:00:00.000000000',
'2017-10-12T00:00:00.000000000'], dtype='datetime64[ns]')
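Both snippets above still loop over the rows in Python. As a rough sketch of what a loop-free variant might look like (this is not from the answer; it swaps in np.repeat plus a cumulative sum, and date_ranges_flat is a hypothetical helper name), assuming daily steps and that each start is on or before its end:

```python
import numpy as np

def date_ranges_flat(pairs):
    """Hypothetical helper: every day in each inclusive [start, end] pair, concatenated."""
    days = np.asarray(pairs).astype('datetime64[D]')
    starts = days[:, 0]
    # Number of days in each pair, counting both endpoints
    counts = (days[:, 1] - starts).astype(np.int64) + 1
    # Position in the flat output where each pair's block begins
    first_index = np.cumsum(counts) - counts
    # Offset of every output element within its own pair: 0, 1, ..., count - 1
    offsets = np.arange(counts.sum()) - np.repeat(first_index, counts)
    result = np.repeat(starts, counts) + offsets.astype('timedelta64[D]')
    return result.astype('datetime64[ns]')

x = np.array([['2017-10-02', '2017-10-04'],
              ['2017-11-01', '2017-11-02']], dtype='datetime64[ns]')
out = date_ranges_flat(x)
print(out)
```

Like the answer's output, this includes both endpoints of each pair, whereas the question's closed='right' call excludes the start date. Whether it is actually faster than the list comprehension would need to be timed on real data.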