Fill with zeros in a frequency of dates grouped by month and year in BigQuery
Question
I have a table with hireDate (DATE), First Name (STRING) and Surname (STRING), like this:
hireDate First Name Surname
13-oct-14 Cintia Roxana Padilla Julca
28-oct-14 Conor McAteer
28-oct-14 Paolo Mesia Macher
28-oct-14 William Anthony Whelan
15-nov-14 Peter Michael Coates
13-feb-15 Natalie Conche
15-mar-15 Beatriz Vargas Huanca
01-may-15 Walter Calle Chenccnes
04-may-15 Sarah Louise Price
I then created a view with the frequency of each hireDate and, in another column, the cumulative frequency, like this:
Row  hireDate    Count  Cumulative
1    13/10/2014  1      1
2    28/10/2014  3      4
3    15/11/2014  1      5
4    13/02/2015  1      6
5    15/03/2015  1      7
6    09/04/2015  1      8
7    15/04/2015  1      9
8    01/05/2015  1      10
The query is as follows:
WITH
  Data AS (
    SELECT
      hireDate,
      COUNT(1) AS Count
    FROM
      `human-resources-221122.human_resources.employees_view`
    WHERE
      status <> "cancelled"
    GROUP BY
      1 )
SELECT
  hireDate,
  Count,
  SUM(Count) OVER (ORDER BY hireDate ASC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Cumulative
FROM
  Data
ORDER BY
  hireDate ASC
But I need to see the counts by month and year, with zeros for the months that have no hires, something like this:
Hire_Month  Hire_Year  Count  Cumulative
October     2014       4      4
November    2014       1      5
December    2014       0      5
January     2015       0      5
February    2015       1      6
March       2015       1      7
April       2015       2      9
May         2015       1      10
Thanks.
Recommended answer
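Note the use of GENERATE_DATE_ARRAY and RIGHT JOIN to get the desired results. GENERATE_DATE_ARRAY produces one row per month between the first and last date, and the RIGHT JOIN keeps the months that have no matching rows in the data: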
WITH data AS (
  SELECT *
  FROM UNNEST([
    STRUCT(DATE("2014-12-03") AS d, 4 AS a)
    , STRUCT("2015-01-05", 7)
    , STRUCT("2015-03-05", 1)
  ])
), all_months AS (
  SELECT month
  FROM UNNEST(GENERATE_DATE_ARRAY(
    (SELECT DATE_TRUNC(MIN(d), MONTH) FROM data)
    , (SELECT MAX(d) FROM data)
    , INTERVAL 1 MONTH)
  ) AS month
)
SELECT month, IFNULL(SUM(a), 0) a, SUM(SUM(a)) OVER(ORDER BY month) a_cum
FROM data
RIGHT JOIN all_months
ON DATE_TRUNC(d, MONTH) = month
GROUP BY month
ORDER BY month
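To see what the all_months step contributes on its own, here is a minimal sketch with hard-coded dates (the two literals are illustrative assumptions, not values from the question's table). The start is already truncated to the first of the month, which is why the query above wraps MIN(d) in DATE_TRUNC; the end date does not need truncating because the array simply stops at the last month start that is not after it.

```sql
-- Illustrative only: the start and end dates are made-up literals.
-- Steps from 2014-10-01 in one-month increments and stops before
-- exceeding 2015-01-15, so it yields the first day of October,
-- November, December 2014 and January 2015.
SELECT month
FROM UNNEST(GENERATE_DATE_ARRAY(
  DATE '2014-10-01',
  DATE '2015-01-15',
  INTERVAL 1 MONTH)
) AS month
ORDER BY month
```

Each of these month rows survives the RIGHT JOIN even when no data row truncates to it, which is where the zero counts come from.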
Now, if we are just counting, you can use the fact that a LEFT/RIGHT JOIN will produce NULLs for empty months. COUNT(d) ignores NULLs, so it returns 0 for those months without needing IFNULL. This is how the query could be adapted to an arbitrary table (Wikipedia pageviews here):
WITH data AS (
  SELECT *, DATE(datehour) d
  FROM `fh-bigquery.wikipedia_v3.pageviews_2018`
  WHERE wiki='pt'
  AND (datehour BETWEEN '2018-09-30' AND '2018-09-30'
    OR datehour BETWEEN '2018-12-01' AND '2018-12-02'
  )
  AND title LIKE 'Calif%'
), all_months AS (
  SELECT month
  FROM UNNEST(GENERATE_DATE_ARRAY(
    (SELECT DATE_TRUNC(MIN(d), MONTH) FROM data)
    , (SELECT MAX(d) FROM data)
    , INTERVAL 1 MONTH)
  ) AS month
)
SELECT month, COUNT(d) c, SUM(COUNT(d)) OVER(ORDER BY month) a_cum
FROM data
RIGHT JOIN all_months
ON DATE_TRUNC(d, MONTH) = month
GROUP BY month
ORDER BY month
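Applied to the table from the question, the same pattern might look like the sketch below. It assumes that hireDate is a DATE column and that the status filter from the question's own query should be kept; the table name is copied from the question, and the output aliases (Hire_Month, Hire_Year, Count, Cumulative) follow the desired result shown there. It has not been run against the asker's data.

```sql
WITH data AS (
  -- Same source and filter as the query in the question.
  SELECT hireDate
  FROM `human-resources-221122.human_resources.employees_view`
  WHERE status <> "cancelled"
), all_months AS (
  -- One row per month from the first hire month to the last hire date.
  SELECT month
  FROM UNNEST(GENERATE_DATE_ARRAY(
    (SELECT DATE_TRUNC(MIN(hireDate), MONTH) FROM data)
    , (SELECT MAX(hireDate) FROM data)
    , INTERVAL 1 MONTH)
  ) AS month
)
SELECT
  FORMAT_DATE('%B', month) AS Hire_Month,    -- full month name, e.g. October
  EXTRACT(YEAR FROM month) AS Hire_Year,
  COUNT(hireDate) AS Count,                  -- NULLs from empty months count as 0
  SUM(COUNT(hireDate)) OVER (ORDER BY month) AS Cumulative
FROM data
RIGHT JOIN all_months
ON DATE_TRUNC(hireDate, MONTH) = month
GROUP BY month
ORDER BY month
```

Grouping and ordering by the month date, rather than by the month name, keeps the rows in calendar order across the year boundary.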