Vertica 日期系列在指定日期前一个月开始 [英] Vertica date series is starting one month before specified date

查看:62
本文介绍了Vertica 日期系列在指定日期前一个月开始的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我使用 Vertica 数据库,我需要进行一个查询,在给定两个日期的情况下,我会得到上述日期之间所有月份的列表.例如,如果我给出查询 2015-01-01 和 2015-12-31,它会输出以下列表:

I work with a Vertica database and I needed to make a query that, given two dates, would give me a list of all months between said dates. For example, if I were to give the query 2015-01-01 and 2015-12-31, it would output me the following list:

2015-01-01
2015-02-01
2015-03-01
2015-04-01
2015-05-01
2015-06-01
2015-07-01
2015-08-01
2015-09-01
2015-10-01
2015-11-01
2015-12-01

经过一番挖掘,我发现了以下查询:

After a bit of digging, I was able to discover the following query:

SELECT date_trunc('MONTH', ts)::date as Mois
FROM 
(
    SELECT '2015-01-01'::TIMESTAMP as tm
    UNION
    SELECT '2015-12-31'::TIMESTAMP as tm
) as t
TIMESERIES ts as '1 month' OVER (ORDER BY tm)

此查询有效并为我提供以下输出:

This query works and gives me the following output:

2014-12-01
2015-01-01
2015-02-01
2015-03-01
2015-04-01
2015-05-01
2015-06-01
2015-07-01
2015-08-01
2015-09-01
2015-10-01
2015-11-01
2015-12-01

如您所见,通过为查询提供2015-01-01"或一月的任何地方的开始日期,我最终会得到一个额外的条目,即 2014-12-01.就其本身而言,该错误(或任何您想称其为意外行为的行为)很容易规避(从 2 月开始),但我不得不承认我的好奇心被激怒了.为什么系列赛在我指定的日期前一个月开始?

As you can see, by giving the query a starting date of '2015-01-01' or anywhere in january for that matters, I end up with an extra entry, namely 2014-12-01. In itself, the bug (or whatever you want to call this unexpected behavior) is easy to circumvent (just start in february), but I have to admit my curiosity's piked. Why exactly is the serie starting one month BEFORE the date I specified?

好的,在阅读了 Kimbo 的警告并确认确实很长一段时间最终会导致问题之后,我能够提出以下正确重新调整日期的查询.

Alright, after reading Kimbo's warning and confirming that indeed, long periods will eventually cause problems, I was able to come up with the following query that readjusts the dates correctly.

SELECT ts as originalMonth, 
ts + 
    (
        mod
        (
            day(first_value(ts) over (order by ts)) - day(ts) + day(last_day(ts)), 
            day(last_day(ts))
        )
    ) as adjustedMonth
FROM 
(
    SELECT ts
    FROM 
    (
        SELECT '2015-01-01'::TIMESTAMP as tm
        UNION
        SELECT '2018-12-31'::TIMESTAMP as tm
    ) as t
    TIMESERIES ts as '1 month' OVER (ORDER BY tm)
) as temp

我唯一的问题是我无法控制系列第一张唱片的第一天.它由 Vertica 自动设置为当天.因此,如果我在本月 31 日运行此查询,我想知道它会如何表现.我想我只能等到 12 月才能看到,除非有人知道如何让时间序列以一种允许我测试的方式运行.

The only problem I have is that I have no control over the initial day of the first record of the series. It's set automatically by Vertica to the current day. So if I run this query on the 31st of the month, I wonder how it'll behave. I guess I'll just have to wait for december to see unless someone knows how to get timeseries to behave in a way that would allow me to test it.

好的,在尝试了许多不同的日期组合后,我能够确定系列开始的日期根据您指定的日期而变化.这造成了很多问题……直到我们决定走简单的道路.我们没有使用月间隔,而是使用日间隔,并且每个月只选择一个特定的日期.方式更简单,它一直有效.这是最终查询:

Okay, so after trying out many different date combinations, I was able to determine that the day which the series starts changes depending on the date you specify. This caused a whole lot of problems... until we decided to go the simple way. Instead of using a month interval, we used a day interval and only selected one specific day per month. WAY simpler and it works all the time. Here's the final query:

SELECT ts as originalMonth
FROM 
(
    SELECT ts
    FROM 
    (
        SELECT '2000-02-01'::TIMESTAMP as tm
        UNION
        SELECT '2018-12-31'::TIMESTAMP as tm
    ) as t
    TIMESERIES ts as '1 day' OVER (ORDER BY tm)
) as temp
where day(ts) = 1

推荐答案

我认为可以归结为以下文档中的声明:http://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/SQLReferenceManual/语句/SELECT/TIMESERIESClause.htm

I think it boils down to this statement from the doc: http://my.vertica.com/docs/7.1.x/HTML/index.htm#Authoring/SQLReferenceManual/Statements/SELECT/TIMESERIESClause.htm

TIME_SLICE 可以返回一个时间片的开始或结束时间,具体取决于在其第四个输入参数 (start_or_end) 的值上.TIMESERIES,另一方面,总是返回每个时间片的开始时间.

TIME_SLICE can return the start or end time of a time slice, depending on the value of its fourth input parameter (start_or_end). TIMESERIES, on the other hand, always returns the start time of each time slice.

当您定义具有某个开始日期(例如,2015-01-01)的时间间隔时,TIMESERIES ts AS '1 个月' 将为其创建第一个时间切片比第一个数据点早 1 个月开始,所以 2014-12-01.当您执行 DATE_TRUNC('MON', ts) 时,即使您的开始日期是 2015,这当然会将第一个日期值设置为 2014-12-01-01-03,或者别的什么.

When you define a time interval with some start date (2015-01-01, for example), then TIMESERIES ts AS '1 month' will create for its first time slice a slice that starts 1 month ahead of that first data point, so 2014-12-01. When you do DATE_TRUNC('MON', ts), that of course sets the first date value to 2014-12-01 even if your start date is 2015-01-03, or whatever.

e:我想再抛出一个警告——我认为您对 DATE_TRUNC 的使用达到了您的要求.但是,来自文档:与 TIME_SLICE 不同,[TIMESERIES] length_and_time_unit_expr 中表示的时间片长度和时间单位必须是常量,因此时间片中的间隙是明确定义的.这意味着1 个月"实际上正好是 30 天.如果您要使用几年以上,这显然有问题.

e: I want to throw out one more warning -- your use of DATE_TRUNC achieves what you need, I think. But, from the doc: Unlike TIME_SLICE, the time slice length and time unit expressed in [TIMESERIES] length_and_time_unit_expr must be constants so gaps in the time slices are well-defined. This means that '1 month' is actually 30 days exactly. This obviously has problems if you're going for more than a couple years.

这篇关于Vertica 日期系列在指定日期前一个月开始的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆