Gaps and islands for school vacations in a country with federal states
Question
The periods table contains periods during which school is off for students. The locations table contains countries and federal states. I want to list all periods from '2019-12-15' to '2020-01-15' and accumulate the total number of off days for each island.
To collect the data I have to search all periods of location 1 and location 2, because weekends, for example, are periods associated with the country, while school vacations are associated with the federal state.
All data is stored in a PostgreSQL database. I have no control over the table structure.
locations
|----|-------------------|------------------|------------|
| id | name | is_federal_state | is_country |
|----|-------------------|------------------|------------|
| 1 | Deutschland | f | t |
| 2 | Baden-Württemberg | t | f |
|----|-------------------|------------------|------------|
"Deutschland" is the country Germany. "Baden-Württemberg" is a federal state of Germany.
periods
|-----|------------|------------|-------------|
| id | starts_on | ends_on | location_id |
|-----|------------|------------|-------------|
| 678 | 2019-12-21 | 2019-12-22 | 1 |
| 534 | 2019-12-23 | 2020-01-04 | 2 |
| 679 | 2019-12-28 | 2019-12-29 | 1 |
| 9 | 2020-01-01 | 2020-01-01 | 2 |
| 776 | 2020-01-04 | 2020-01-05 | 1 |
| 7 | 2020-01-06 | 2020-01-06 | 2 |
| 777 | 2020-01-11 | 2020-01-12 | 1 |
|-----|------------|------------|-------------|
The periods table stores information about periods during which students don't have to attend school. 678, 679, 776 and 777 are weekends, which are associated with the country (id 1). The others are school vacations or public holidays, which are associated with the federal state (id 2).
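To reproduce the tables above locally, a setup script along the following lines could be used. This is only a sketch: the column types (integer, text, boolean, date) and the constraints are assumptions, since only the column names and sample values are shown.

```sql
-- Assumed schema: types and constraints are guesses based on the sample rows.
CREATE TABLE locations (
    id               integer PRIMARY KEY,
    name             text    NOT NULL,
    is_federal_state boolean NOT NULL,
    is_country       boolean NOT NULL
);

CREATE TABLE periods (
    id          integer PRIMARY KEY,
    starts_on   date    NOT NULL,
    ends_on     date    NOT NULL,
    location_id integer NOT NULL REFERENCES locations (id)
);

-- Sample rows copied from the locations table above.
INSERT INTO locations (id, name, is_federal_state, is_country) VALUES
    (1, 'Deutschland',       false, true),
    (2, 'Baden-Württemberg', true,  false);

-- Sample rows copied from the periods table above.
INSERT INTO periods (id, starts_on, ends_on, location_id) VALUES
    (678, '2019-12-21', '2019-12-22', 1),
    (534, '2019-12-23', '2020-01-04', 2),
    (679, '2019-12-28', '2019-12-29', 1),
    (9,   '2020-01-01', '2020-01-01', 2),
    (776, '2020-01-04', '2020-01-05', 1),
    (7,   '2020-01-06', '2020-01-06', 2),
    (777, '2020-01-11', '2020-01-12', 1);
```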
With the help of "Find rows with adjourning date ranges and accumulate their durations" I found out that this is a gaps and islands problem. But I don't understand how to apply it properly. This is what I have:
SELECT p.*,
       (Max(ends_on) OVER (PARTITION BY location_id) - Min(starts_on) OVER (PARTITION BY location_id)
       ) + 1 AS duration
FROM (SELECT p.*,
             Count(*) FILTER (WHERE prev_eo < starts_on - INTERVAL '1 day') OVER
                 (PARTITION BY location_id
                  ORDER BY starts_on
                 ) AS grp
      FROM (SELECT id,
                   starts_on,
                   ends_on,
                   location_id,
                   lag(ends_on) OVER (PARTITION BY location_id ORDER BY starts_on) AS prev_eo
            FROM periods
            WHERE location_id IN (1, 2) AND
                  starts_on > '2019-12-15' AND
                  starts_on < '2020-01-15'
           ) p
     ) p;
Result
|-----|------------|------------|-------------|-----|----------|
| id | starts_on | ends_on | location_id | grp | duration |
|-----|------------|------------|-------------|-----|----------|
| 678 | 2019-12-21 | 2019-12-22 | 1 | 0 | 23 |
| 679 | 2019-12-28 | 2019-12-29 | 1 | 1 | 23 |
| 776 | 2020-01-04 | 2020-01-05 | 1 | 2 | 23 |
| 777 | 2020-01-11 | 2020-01-12 | 1 | 3 | 23 |
| 534 | 2019-12-23 | 2020-01-04 | 2 | 0 | 15 |
| 9 | 2020-01-01 | 2020-01-01 | 2 | 0 | 15 |
| 7 | 2020-01-06 | 2020-01-06 | 2 | 1 | 15 |
|-----|------------|------------|-------------|-----|----------|
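Problems with this result:
- This result assumes that all the weekends of location id 1 form one big island, which is not the case. All but one weekend belong to the Christmas vacation island.
- The actual Christmas vacation of id 2 should result in a duration of 15, which should include id 678, which should be part of the Christmas island (plus the other weekends in that period).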
|-----|------------|------------|-------------|----------|
| id | starts_on | ends_on | location_id | duration |
|-----|------------|------------|-------------|----------|
| 678 | 2019-12-21 | 2019-12-22 | 1 | 15 |
| 679 | 2019-12-28 | 2019-12-29 | 1 | 15 |
| 776 | 2020-01-04 | 2020-01-05 | 1 | 15 |
| 777 | 2020-01-11 | 2020-01-12 | 1 | 2 |
| 534 | 2019-12-23 | 2020-01-04 | 2 | 15 |
| 9 | 2020-01-01 | 2020-01-01 | 2 | 15 |
| 7 | 2020-01-06 | 2020-01-06 | 2 | 15 |
|-----|------------|------------|-------------|----------|
Which query would produce this result?
All data and the query: https://rextester.com/FEL57082
Recommended answer
There is a bug in the code. The grp is calculated, but not used. So, I think you want:
SELECT p.*,
(Max(ends_on) OVER (PARTITION BY location_id, grp) - Min(starts_on) OVER (PARTITION BY location_id, grp)
) + 1 AS duration
FROM (SELECT p.*,
Count(*) FILTER (WHERE prev_eo < starts_on - INTERVAL '1 day') OVER
(PARTITION BY location_id
ORDER BY starts_on
) AS grp
FROM (SELECT p.*,
lag(ends_on) OVER (PARTITION BY location_id ORDER BY starts_on) AS prev_eo
FROM periods p
WHERE location_id IN (1, 2) AND
starts_on > '2019-12-15' AND
starts_on < '2020-01-15'
) p
) p;
However, this returns 2 for the weekends, not 17.
Here is your Rextester with this query.
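If the weekends of the country and the vacations of the federal state are meant to merge into shared islands, one possible approach is to build the islands without partitioning by location_id. The following is a sketch rather than a definitive solution. It assumes that starts_on and ends_on are of type date (so that date minus integer and date minus date work), and it introduces the hypothetical column name prev_max_eo. It also compares each row against the running maximum of all earlier ends_on values instead of lag(ends_on), so that a short period lying inside a longer one (such as id 9 inside id 534) does not open a new island by mistake.

```sql
SELECT p.*,
       -- Span of the whole island, inclusive of both end days.
       (max(ends_on) OVER (PARTITION BY grp) - min(starts_on) OVER (PARTITION BY grp)
       ) + 1 AS duration
FROM (SELECT p.*,
             -- A new island starts when every earlier period ended
             -- more than one day before this period begins.
             count(*) FILTER (WHERE prev_max_eo < starts_on - 1) OVER
                 (ORDER BY starts_on, ends_on, id
                 ) AS grp
      FROM (SELECT p.*,
                   -- Latest end date among all earlier rows, across both locations
                   -- (prev_max_eo is a hypothetical name, NULL for the first row).
                   max(ends_on) OVER (ORDER BY starts_on, ends_on, id
                                      ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                                     ) AS prev_max_eo
            FROM periods p
            WHERE location_id IN (1, 2) AND
                  starts_on > '2019-12-15' AND
                  starts_on < '2020-01-15'
           ) p
     ) p
ORDER BY starts_on, id;
```

With the sample rows from the question, the six periods from 2019-12-21 through 2020-01-06 fall into one island with a duration of 17, and id 777 forms its own island with a duration of 2. Note that duration here is the calendar span of the island, which equals the number of off days only because an island has no gaps by construction.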