连接 2 个表,其中两组数字在连接列中重叠 [英] Join 2 tables where two sets of numbers overlap within the joining columns

查看:36
本文介绍了连接 2 个表,其中两组数字在连接列中重叠的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要用 postgresql 连接 2 个表,其中两组数字在连接列中重叠.

下图解释了这一点 - 我需要一张国会议员及其党派关系的表格,并将其与一张选区表(基于选区的绘制或重新绘制时间)相结合.结果将是显示地区、州和国会议员相同日期的行.如果某个地区的日期已知而国会议员的日期未知,则该地区的已知日期将填入该部分,而国会议员的日期则留空 - 反之亦然.

The image below explains it - I am needing to take a table of congresspeople and their party affiliation and join it with a table of districts (based on when the districts were drawn or redrawn). The result will be the rows that show the dates that the district, state and congressperson were the same. Wherever there are dates of a district that are known and the congressperson dates are unknown, the dates that are known for the district are filled for that portion, and the dates for the congressperson are left blank - and vice versa.

例如,对于表格中的第一行:

For example, for the first rows in the tables:

国会议员表:

Arkansas, District 5, Republican: 1940-1945

区表:

Arkansas, District 5: 1942-1963

产生以下组合(Start_Comb 和 End_Comb):

Results in the following combinations (Start_Comb and End_Comb):

1940-1942
1942-1945

对于地区未知的组合(1940-1942),地区日期留空.

And for the combination where the district is unknown (1940-1942), the district dates are left blank.

最后一组日期列(灰色)只是仅针对该地区的组合(这非常简单).

The final set of date columns (gray) is simply the combinations that are only for the district (this is super easy).

如果你想知道这是做什么的,我正在创建一个动画地图,有点像这样,但随着时间的推移,国会选区:https://www.youtube.com/watch?v=vQDyn04vtf8

In case you're wondering what this is for, I am creating an animated map, kind of like this, but for congressional districts over time: https://www.youtube.com/watch?v=vQDyn04vtf8

我最终会得到一张地图,其中对于每个已知的地区,都有一个已知或未知的政党.

I'll end up with something where there is a map where for every known district, there is a known or unknown party.

还没走多远,这就是我所做的:

Haven't got very far, this is what I did:

SELECT *
FROM congressperson
JOIN districts
ON Start_Dist BETWEEN Start_Cong AND End_Cong
WHERE district.A = district.B
OR End_Dist BETWEEN Start_Cong AND Start_Dist
OR Start_Cong = Start_Dist OR End_Cong= End_Dist;

推荐答案

这个想法是首先从两个表中列出唯一的日期.然后为每个这样的日期找到下一个日期(在这种特殊情况下,日期按州、地区分组,下一个日期是查找特定州、地区).
所以现在我们有了我们正在寻找的范围列表.现在我们可以根据需要的条件加入(对于这个特殊任务的左加入)其他表:

The idea is to make list of unique dates from both tables first. Then for each such date find next date (in this particular case dates are grouped by state, district, and next date is looked for particular state, district).
So now we have list of ranges we are looking for. Now we can join (for this paticular task left join) other tables by required conditions:

select
    r.state,
    c.start_cong,
    c.end_cong,
    c.party,
    coalesce(c.district, d.district) district,
    d.start_dist,
    d.end_dist,
    start_comb,
    end_comb,
    case when d.district is not null then start_comb end final_start,
    case when d.district is not null then end_comb end final_end
from (
    with dates as (
        select
            *
        from (
            SELECT 
                c.state,
                c.district,
                start_cong date
            FROM congressperson c
            union 
            SELECT
                c.state,
                c.district, 
                end_cong
            FROM congressperson c
            union 
            SELECT 
                d.state,
                d.district,
                start_dist
            FROM district d 
            union 
            SELECT
                d.state,
                d.district, 
                end_dist
            FROM district d 
        ) DATES
        group by 
            state,
            district,
            date
        order by 
            state,
            district,    
            date
    ) 
    select
        dates.state,
        dates.district,
        dates.date start_comb,
    (select 
        d.date 
    from 
        dates d
    where
        d.state = dates.state and
        d.district = dates.district and
        d.date > dates.date
    order by 
        d.date
    limit 1
    ) end_comb
    from 
        dates) r
left join congressperson c on 
                            c.state = r.state and
                            c.district = r.district and
                            start_comb between c.start_cong and c.end_cong and 
                            end_comb between c.start_cong and c.end_cong
left join district d on 
                        d.state = r.state and
                        d.district = r.district and
                        start_comb between d.start_dist and d.end_dist and 
                        end_comb between d.start_dist and d.end_dist
where
    end_comb is not null 
order by 
    r.state, coalesce(c.district, d.district), start_comb, end_comb, start_cong, end_cong

这篇关于连接 2 个表,其中两组数字在连接列中重叠的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆