Getting date ranges for multiple datetime pairs


Question

Given a datetime array of shape (n, 2):

x = np.array([['2017-10-02T00:00:00.000000000', '2017-10-12T00:00:00.000000000']], dtype='datetime64[ns]') 

Here x has shape (1, 2), but in general it can be (n, 2) with n >= 1. In each pair, the first date is always earlier than or equal to the second. I want to get all the dates in the range spanned by each pair in x, concatenated into one array. This is basically what I'm doing:

# pandas >= 1.4 spells this keyword `inclusive`; older versions use closed='right'
np.concatenate([pd.date_range(*y, inclusive='right') for y in x])

This works and gives:

array(['2017-10-03T00:00:00.000000000', '2017-10-04T00:00:00.000000000',
       '2017-10-05T00:00:00.000000000', '2017-10-06T00:00:00.000000000',
       '2017-10-07T00:00:00.000000000', '2017-10-08T00:00:00.000000000',
       '2017-10-09T00:00:00.000000000', '2017-10-10T00:00:00.000000000',
       '2017-10-11T00:00:00.000000000', '2017-10-12T00:00:00.000000000'], dtype='datetime64[ns]')

But this is pretty slow because of the list comprehension; it isn't vectorised the way I'd like. Is there a better way to obtain date ranges for multiple pairs of dates?

I'll provide as much clarification as needed. Thanks.

Recommended answer

It's a tad convoluted, but:

d = np.array(1, dtype='timedelta64[D]')
x = x.astype('datetime64[D]')
deltas = np.diff(x, axis=1) / d
np.concatenate([
    i + np.arange(j + 1) for i, j in zip(x[:, 0], deltas[:, 0].astype(int))
]).astype('datetime64[ns]')

array(['2017-10-02T00:00:00.000000000', '2017-10-03T00:00:00.000000000',
       '2017-10-04T00:00:00.000000000', '2017-10-05T00:00:00.000000000',
       '2017-10-06T00:00:00.000000000', '2017-10-07T00:00:00.000000000',
       '2017-10-08T00:00:00.000000000', '2017-10-09T00:00:00.000000000',
       '2017-10-10T00:00:00.000000000', '2017-10-11T00:00:00.000000000',
       '2017-10-12T00:00:00.000000000'], dtype='datetime64[ns]')


How it works

  • d represents one day.
  • x is converted to datetime64[D], i.e. dates with no time component.
  • np.diff gives the difference between the two columns in days, but still as a timedelta64.
  • Dividing by d, which is also a timedelta64, cancels the units and leaves a float, which I cast to int.
  • Adding an integer array to a start date from x[:, 0] broadcasts, and each integer is interpreted as that many units of the date's dtype. Since that dtype is datetime64[D], each step adds one day.
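The steps above can be traced on a small input to see the dtype at each stage. This is only an illustrative sketch: the second pair of dates and the names one_day, days, counts and first_range are made up for the example and do not appear in the original answer.

```python
import numpy as np

# Two (start, end) pairs; the second pair is invented for illustration.
x = np.array([['2017-10-02', '2017-10-12'],
              ['2017-11-01', '2017-11-03']], dtype='datetime64[ns]')

one_day = np.array(1, dtype='timedelta64[D]')

# Drop the time component: datetime64[ns] -> datetime64[D].
days = x.astype('datetime64[D]')

# Column-wise difference; still a timedelta, shape (n, 1).
deltas = np.diff(days, axis=1)

# timedelta / timedelta is a unitless float; cast it to plain integers.
counts = (deltas / one_day)[:, 0].astype(int)

# Adding an integer array to a datetime64[D] scalar steps in whole days.
first_range = days[0, 0] + np.arange(counts[0] + 1)

print(deltas.dtype, counts, first_range.dtype)
```

The last line of the block is the broadcasting step from the final bullet: the integers produced by np.arange are treated as day offsets because the start date has day resolution.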

Derived from / inspired by @hpaulj.
Will remove if they post an answer.

d = np.array(1, dtype='timedelta64[D]')
np.concatenate([np.arange(row[0], row[1] + 1, d) for row in x])

array(['2017-10-02T00:00:00.000000000', '2017-10-03T00:00:00.000000000',
       '2017-10-04T00:00:00.000000000', '2017-10-05T00:00:00.000000000',
       '2017-10-06T00:00:00.000000000', '2017-10-07T00:00:00.000000000',
       '2017-10-08T00:00:00.000000000', '2017-10-09T00:00:00.000000000',
       '2017-10-10T00:00:00.000000000', '2017-10-11T00:00:00.000000000',
       '2017-10-12T00:00:00.000000000'], dtype='datetime64[ns]')
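Both snippets above still run a Python-level loop, one iteration per pair. If that loop is the bottleneck, one possible loop-free variant builds every range at once with np.repeat and a running offset. This is a sketch and not part of the original answer: the function name date_ranges is hypothetical, and it assumes day resolution and that each start date is not later than its end date, as stated in the question.

```python
import numpy as np

def date_ranges(pairs):
    """Concatenate the inclusive daily ranges for each (start, end) row."""
    pairs = np.asarray(pairs, dtype='datetime64[D]')
    starts = pairs[:, 0]

    # Number of dates in each range, counting both endpoints.
    lengths = (pairs[:, 1] - starts).astype(np.int64) + 1

    # Position in the output where each range begins.
    first_index = np.cumsum(lengths) - lengths

    # Offset of every output element within its own range: 0, 1, 2, ...
    offsets = np.arange(lengths.sum()) - np.repeat(first_index, lengths)

    # Repeat each start date, then shift by the per-element day offset.
    out = np.repeat(starts, lengths) + offsets.astype('timedelta64[D]')
    return out.astype('datetime64[ns]')

x = np.array([['2017-10-02', '2017-10-12']], dtype='datetime64[ns]')
result = date_ranges(x)
```

The trade-off is a few temporary arrays of the full output length in exchange for removing the per-pair loop, so the benefit should be largest when there are many short ranges.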
