如何在Redshift中创建日期表? [英] How do I create a dates table in Redshift?
问题描述
tl; dr:我想在Redshift中生成一个日期表,以使报表更易于生成.更可取的是,不需要在redshift中已经有大表,而需要上传一个csv文件.
tl;dr: I want to generate a dates table in Redshift in order to make a report easier to generate. Preferable without needing large tables already in redshift, needing to upload a csv file.
长版: 我正在处理一份报告,其中我必须平均一周中每天创建的新项目.日期范围可能跨越数月或更长时间,因此,例如可能有5个星期一,但只有4个星期日,这可能会使数学有些棘手.另外,我不能保证每天都有一个实例,尤其是一旦用户开始切片数据时.哪,这会破坏BI工具.
long version: I am working on a report where I have to average new items created per day of the week. The date range could span months or more, so there could be, say, 5 Mondays but only 4 Sundays, which can make the math a little tricky. Also, I am not guaranteed an instance of a single item per day, especially once a user starts slicing the data. Which, this is tripping up the BI tool.
解决此问题的最佳方法很可能是日期表.但是,大多数有关日期表的教程都使用Redshift不可用或不完全支持的SQL命令(我在找您, generate_series ).
The best way to tackle this problem is most likely a dates table. However, most of the tutorials for dates tables use SQL commands that are not available or not fully supported by Redshift (I'm looking at you, generate_series).
是否有一种简单的方法可以在Redshift中生成日期表?
The code I was attempting to use: (based on this also-not-working recommendation: http://elliot.land/post/building-a-date-dimension-table-in-redshift )
CREATE TABLE facts.dates (
"date_id" INTEGER NOT NULL PRIMARY KEY,
-- DATE
"full_date" DATE NOT NULL,
-- YEAR
"year_number" SMALLINT NOT NULL,
"year_week_number" SMALLINT NOT NULL,
"year_day_number" SMALLINT NOT NULL,
-- QUARTER
"qtr_number" SMALLINT NOT NULL,
-- MONTH
"month_number" SMALLINT NOT NULL,
"month_name" CHAR(9) NOT NULL,
"month_day_number" SMALLINT NOT NULL,
-- WEEK
"week_day_number" SMALLINT NOT NULL,
-- DAY
"day_name" CHAR(9) NOT NULL,
"day_is_weekday" SMALLINT NOT NULL,
"day_is_last_of_month" SMALLINT NOT NULL
) DISTSTYLE ALL SORTKEY (date_id)
;
INSERT INTO facts.dates
(
"date_id"
,"full_date"
,"year_number"
,"year_week_number"
,"year_day_number"
-- QUARTER
,"qtr_number"
-- MONTH
,"month_number"
,"month_name"
,"month_day_number"
-- WEEK
,"week_day_number"
-- DAY
,"day_name"
,"day_is_weekday"
,"day_is_last_of_month"
)
SELECT
cast(seq + 1 AS INTEGER) AS date_id,
-- DATE
datum AS full_date,
-- YEAR
cast(extract(YEAR FROM datum) AS SMALLINT) AS year_number,
cast(extract(WEEK FROM datum) AS SMALLINT) AS year_week_number,
cast(extract(DOY FROM datum) AS SMALLINT) AS year_day_number,
-- QUARTER
cast(to_char(datum, 'Q') AS SMALLINT) AS qtr_number,
-- MONTH
cast(extract(MONTH FROM datum) AS SMALLINT) AS month_number,
to_char(datum, 'Month') AS month_name,
cast(extract(DAY FROM datum) AS SMALLINT) AS month_day_number,
-- WEEK
cast(to_char(datum, 'D') AS SMALLINT) AS week_day_number,
-- DAY
to_char(datum, 'Day') AS day_name,
CASE WHEN to_char(datum, 'D') IN ('1', '7')
THEN 0
ELSE 1 END AS day_is_weekday,
CASE WHEN
extract(DAY FROM (datum + (1 - extract(DAY FROM datum)) :: INTEGER +
INTERVAL '1' MONTH) :: DATE -
INTERVAL '1' DAY) = extract(DAY FROM datum)
THEN 1
ELSE 0 END AS day_is_last_of_month
FROM
-- Generate days for 81 years starting from 2000.
(
SELECT
'2000-01-01' :: DATE + generate_series AS datum,
generate_series AS seq
FROM generate_series(0,81 * 365 + 20,1)
) DQ
ORDER BY 1;
哪个会引发此错误
[Amazon](500310) Invalid operation: Specified types or functions (one per INFO message) not supported on Redshift tables.;
1 statement failed.
...,因为我认为在Redshift的同一命令中不允许INSERT和 generate_series
... because, I assume, INSERT and generate_series are not allowed in the same command in Redshift
推荐答案
在提出问题时,我发现了问题.糟糕!
In asking the question, I figured it out. Oops.
我从事实"模式开始.
CREATE SCHEMA facts;
运行以下命令以启动数字表:
Run the following to start a numbers table:
create table facts.numbers
(
number int PRIMARY KEY
)
;
使用它来生成您的号码列表.我花了一百万开始
Use this to generate your number list. I used a million to get started
SELECT ',(' || generate_series(0,1000000,1) || ')'
;
然后将您的结果中的数字复制粘贴到以下查询中,位于VALUES之后:
Then copy-paste the numbers from your results in the query below, after VALUES:
INSERT INTO facts.numbers
VALUES
(0)
,(1)
,(2)
,(3)
,(4)
,(5)
,(6)
,(7)
,(8)
,(9)
-- etc
^确保从复制粘贴的数字列表中删除前导逗号
^ Make sure to remove the leading comma from the copy-pasted list of numbers
一旦有了数字表,就可以生成日期表(同样,从elliot land
Once you have a numbers table, then you can generate a dates table (again, stealing code from elliot land http://elliot.land/post/building-a-date-dimension-table-in-redshift ) :
CREATE TABLE facts.dates (
"date_id" INTEGER NOT NULL PRIMARY KEY,
-- DATE
"full_date" DATE NOT NULL,
-- YEAR
"year_number" SMALLINT NOT NULL,
"year_week_number" SMALLINT NOT NULL,
"year_day_number" SMALLINT NOT NULL,
-- QUARTER
"qtr_number" SMALLINT NOT NULL,
-- MONTH
"month_number" SMALLINT NOT NULL,
"month_name" CHAR(9) NOT NULL,
"month_day_number" SMALLINT NOT NULL,
-- WEEK
"week_day_number" SMALLINT NOT NULL,
-- DAY
"day_name" CHAR(9) NOT NULL,
"day_is_weekday" SMALLINT NOT NULL,
"day_is_last_of_month" SMALLINT NOT NULL
) DISTSTYLE ALL SORTKEY (date_id)
;
INSERT INTO facts.dates
(
"date_id"
,"full_date"
,"year_number"
,"year_week_number"
,"year_day_number"
-- QUARTER
,"qtr_number"
-- MONTH
,"month_number"
,"month_name"
,"month_day_number"
-- WEEK
,"week_day_number"
-- DAY
,"day_name"
,"day_is_weekday"
,"day_is_last_of_month"
)
SELECT
cast(seq + 1 AS INTEGER) AS date_id,
-- DATE
datum AS full_date,
-- YEAR
cast(extract(YEAR FROM datum) AS SMALLINT) AS year_number,
cast(extract(WEEK FROM datum) AS SMALLINT) AS year_week_number,
cast(extract(DOY FROM datum) AS SMALLINT) AS year_day_number,
-- QUARTER
cast(to_char(datum, 'Q') AS SMALLINT) AS qtr_number,
-- MONTH
cast(extract(MONTH FROM datum) AS SMALLINT) AS month_number,
to_char(datum, 'Month') AS month_name,
cast(extract(DAY FROM datum) AS SMALLINT) AS month_day_number,
-- WEEK
cast(to_char(datum, 'D') AS SMALLINT) AS week_day_number,
-- DAY
to_char(datum, 'Day') AS day_name,
CASE WHEN to_char(datum, 'D') IN ('1', '7')
THEN 0
ELSE 1 END AS day_is_weekday,
CASE WHEN
extract(DAY FROM (datum + (1 - extract(DAY FROM datum)) :: INTEGER +
INTERVAL '1' MONTH) :: DATE -
INTERVAL '1' DAY) = extract(DAY FROM datum)
THEN 1
ELSE 0 END AS day_is_last_of_month
FROM
-- Generate days for 81 years starting from 2000.
(
SELECT
'2000-01-01' :: DATE + number AS datum,
number AS seq
FROM facts.numbers
WHERE number between 0 and 81 * 365 + 20
) DQ
ORDER BY 1;
^请确保在所需日期范围的末尾设置数字
^ Be sure to set the numbers at the end for the date range you need
这篇关于如何在Redshift中创建日期表?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!