More efficient way to find employees with coverage between two dates


Problem Description

I need to retrieve a list of employees and, for each employee, the months in a given year during which they were actively on benefits coverage. There is a table with job data and a table with benefits information. There is also a delivered dates table that lists every date from 2007 to 2018 and, for each date, shows the day of month, month of year, and calendar year.

The way I have written the query now is: find all the dates in the dates table that are 1) between 01/01 and 12/31 of the prompt year (or the current date, whichever is earlier), and 2) within the time the employee was active on the benefits table. For each date I also want the DEPTID from the jobs table and the benefit plan from the benefits table as of that date. Then I do a DISTINCT, showing only the month of year and calendar year for each employee.

This works, but the problem comes when I run it for departments with lots of people in them. It takes a very long time to run, I believe because it retrieves up to 365 rows for every single employee and then shows at most 12 of them, since it only pulls distinct months. I feel like there is a better way to do this; I just can't think of what it is.

Here are some simplified examples of the tables I'm working with:

Dates Table

THE_DATE   MONTHOFYEAR   CALENDAR_YEAR
01-OCT-15  10            2015
02-OCT-15  10            2015
03-OCT-15  10            2015
...

Jobs Table

(A=Active; I=Inactive)

EMPLID     EFFDT         DEPTID           HR_STATUS
00123      01-FEB-15     900              A
00123      30-JUN-15     900              I
00123      01-AUG-15     901              A

Benefits Table

EMPLID     EFFDT         BENEFIT_PLAN     STATUS
00123      01-MAR-15     PPO              A
00123      31-JUL-15                      I
00123      01-SEP-15     HMO              A

Desired Result

EMPLID     CALENDAR_YEAR MONTHOFYEAR      DEPTID         BENEFIT_PLAN
00123      2015          3                900            PPO
00123      2015          4                900            PPO
00123      2015          5                900            PPO
00123      2015          6                900            PPO
00123      2015          7                900            PPO
00123      2015          9                901            HMO
00123      2015          10               901            HMO
00123      2015          11               901            HMO
^ (shows November row even though employee was only covered for part of this month)

Sample SQL that produces the above result

SELECT DISTINCT J.EMPLID, D.CALENDAR_YEAR, D.MONTHOFYEAR, J.DEPTID, B.BENEFIT_PLAN
FROM DATES D, 
     JOBS J 
     JOIN 
     BENEFITS B 
     ON J.EMPLID = B.EMPLID
WHERE D.THE_DATE <= SYSDATE
AND D.THE_DATE BETWEEN 
        TO_DATE(:YEAR_PROMPT || '-01-01', 'YYYY-MM-DD') 
        AND 
        TO_DATE(:YEAR_PROMPT || '-12-31', 'YYYY-MM-DD')
AND B.STATUS = 'A'
AND D.THE_DATE BETWEEN 
        B.EFFDT 
        AND 
        -- a scalar subquery passed as a function argument needs its own parentheses
        NVL((SELECT MIN(B_ED.EFFDT) 
             FROM BENEFITS B_ED
             WHERE B_ED.EMPLID = B.EMPLID
             AND B_ED.EFFDT > B.EFFDT)
        , SYSDATE)
AND J.EFFDT = (SELECT MAX(J_ED.EFFDT)
               FROM JOBS J_ED
               WHERE J_ED.EMPLID = J.EMPLID
               AND J_ED.EFFDT <= D.THE_DATE)

Instead of saying "retrieve every single date and check to see if it fits the criteria", can I change the logic somehow to get the same results without churning through so many rows?

Recommended Answer

Yes; by using the LEAD() analytic function, you can calculate the next EFFDT for each row in the jobs and benefits tables. That turns each row into a date range, which makes it easier to query between the ranges.

Something like:

with dates as (select trunc(sysdate, 'yyyy') - 1 + level the_date,
                      to_number(to_char(trunc(sysdate, 'yyyy') - 1 + level, 'mm')) monthofyear,
                      to_number(to_char(sysdate, 'yyyy')) calendar_year
               from   dual
               connect by level <= 365),
      jobs as (select 123 emplid, to_date('01/02/2015', 'dd/mm/yyyy') effdt, 900 deptid, 'A' hr_status from dual union all
               select 123 emplid, to_date('30/06/2015', 'dd/mm/yyyy') effdt, 900 deptid, 'I' hr_status from dual union all
               select 123 emplid, to_date('01/08/2015', 'dd/mm/yyyy') effdt, 901 deptid, 'A' hr_status from dual),
  benefits as (select 123 emplid, to_date('01/03/2015', 'dd/mm/yyyy') effdt, 'PPO' benefit_plan, 'A' status from dual union all
               select 123 emplid, to_date('31/07/2015', 'dd/mm/yyyy') effdt, null benefit_plan, 'I' status from dual union all
               select 123 emplid, to_date('01/09/2015', 'dd/mm/yyyy') effdt, 'HMO' benefit_plan, 'A' status from dual),
-- ********* end of mimicking your tables ********* --
         j as (select emplid,
                      effdt,
                      deptid,
                      hr_status,
                      lead(effdt, 1, sysdate) over (partition by emplid order by effdt) next_effdt
               from   jobs),
         b as (select emplid,
                      effdt,
                      benefit_plan,
                      status,
                      lead(effdt, 1, sysdate) over (partition by emplid order by effdt) next_effdt
               from   benefits)
select distinct j.emplid,
                d.calendar_year,
                d.monthofyear,
                j.deptid,
                b.benefit_plan
from   j
       inner join dates d on (d.the_date >= j.effdt and d.the_date < j.next_effdt)
       inner join b on (j.emplid = b.emplid)
where  d.the_date <= sysdate
and    d.the_date between to_date (:year_prompt || '-01-01', 'YYYY-MM-DD')
                      and to_date (:year_prompt || '-12-31', 'YYYY-MM-DD') -- if no index on d.the_date, maybe use to_char(d.the_date, 'yyyy') = :year_prompt
and    b.status = 'A'
and    d.the_date between b.effdt and b.next_effdt
order by 1, 4, 2, 3;

    EMPLID CALENDAR_YEAR MONTHOFYEAR     DEPTID BENEFIT_PLAN
---------- ------------- ----------- ---------- ------------
       123          2015           3        900 PPO         
       123          2015           4        900 PPO         
       123          2015           5        900 PPO         
       123          2015           6        900 PPO         
       123          2015           7        900 PPO         
       123          2015           9        901 HMO         
       123          2015          10        901 HMO         
       123          2015          11        901 HMO   

(Obviously, you can exclude the dates, jobs and benefits subqueries from the above query, since you already have those tables. They're only present in the query to simulate having tables with that data in them without needing to actually create the tables.)
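To see what the LEAD() step contributes on its own, here is a minimal sketch that runs just that step against the sample benefits rows from the question. It is an illustration only: the inline benefits subquery stands in for the real table, and it assumes Oracle (dual, sysdate).

```sql
-- Sample rows copied from the question's Benefits table (stand-in for the real table).
with benefits as (
  select 123 emplid, date '2015-03-01' effdt, 'PPO' benefit_plan, 'A' status from dual union all
  select 123, date '2015-07-31', null, 'I' from dual union all
  select 123, date '2015-09-01', 'HMO', 'A' from dual
)
select emplid,
       effdt,
       benefit_plan,
       status,
       -- effdt of the next row for the same employee; the latest row has no
       -- successor, so the third argument supplies sysdate as its end
       lead(effdt, 1, sysdate) over (partition by emplid order by effdt) next_effdt
from   benefits
order  by emplid, effdt;
```

For this data the PPO row gets a next_effdt of 31 July 2015, the inactive row gets 1 September 2015, and the HMO row falls back to sysdate. Each row then carries both ends of its range, so the correlated MIN(EFFDT) subquery from the question is no longer needed.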

ETA: Here's a version that just calculates the 12 months based on the year that's passed in, which reduces the date rows to 12, rather than 365/366 rows.

Unfortunately, you'll still need the distinct, to account for cases where multiple rows start in the same month.

For example, with the data in the following example, you would end up with 3 rows for month 6 if you removed the distinct. However, the number of rows the distinct operates over will be far smaller than before.

with dates as (select add_months(to_date(:year_prompt || '-01-01', 'YYYY-MM-DD'), - 1 + level) the_date,
                      level monthofyear,
                      :year_prompt calendar_year -- assuming this is a number
               from   dual
               connect by level <= 12),
      jobs as (select 123 emplid, to_date('01/02/2015', 'dd/mm/yyyy') effdt, 900 deptid, 'A' hr_status from dual union all
               select 123 emplid, to_date('15/06/2015', 'dd/mm/yyyy') effdt, 900 deptid, 'I' hr_status from dual union all
               select 123 emplid, to_date('26/06/2015', 'dd/mm/yyyy') effdt, 900 deptid, 'A' hr_status from dual union all
               select 123 emplid, to_date('01/08/2015', 'dd/mm/yyyy') effdt, 901 deptid, 'A' hr_status from dual),
  benefits as (select 123 emplid, to_date('01/03/2015', 'dd/mm/yyyy') effdt, 'PPO' benefit_plan, 'A' status from dual union all
               select 123 emplid, to_date('31/07/2015', 'dd/mm/yyyy') effdt, null benefit_plan, 'I' status from dual union all
               select 123 emplid, to_date('01/09/2015', 'dd/mm/yyyy') effdt, 'HMO' benefit_plan, 'A' status from dual),
-- ********* end of mimicking your tables ********* --
         j as (select emplid,
                      trunc(effdt, 'mm') effdt,
                      deptid,
                      hr_status,
                      trunc(coalesce(lead(effdt) over (partition by emplid order by effdt) -1, sysdate), 'mm') end_effdt
                        -- subtracting 1 from lead(effdt) here, since the original sql had d.the_date < j.next_effdt and we need
                        -- to take into account when the next effdt is the first of the month; in that case we want the range to end in the previous month
               from   jobs),
         b as (select emplid,
                      trunc(effdt, 'mm') effdt,
                      benefit_plan,
                      status,
                      trunc(lead(effdt, 1, sysdate) over (partition by emplid order by effdt), 'mm') end_effdt
               from   benefits)
select distinct j.emplid,
                d.calendar_year,
                d.monthofyear,
                j.deptid,
                b.benefit_plan
from   j
       inner join dates d on (d.the_date between j.effdt and j.end_effdt)
       inner join b on (j.emplid = b.emplid)
where  d.the_date <= sysdate
and    b.status = 'A'
and    d.the_date between b.effdt and b.end_effdt
order by 1, 4, 2, 3;

    EMPLID CALENDAR_YEAR MONTHOFYEAR     DEPTID BENEFIT_PLAN                    
---------- ------------- ----------- ---------- --------------------------------
       123 2015                    3        900 PPO                             
       123 2015                    4        900 PPO                             
       123 2015                    5        900 PPO                             
       123 2015                    6        900 PPO                             
       123 2015                    7        900 PPO                             
       123 2015                    9        901 HMO                             
       123 2015                   10        901 HMO                             
       123 2015                   11        901 HMO    
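To check the claim about month 6, the j subquery can be run on its own against the sample jobs rows. This is a sketch for illustration rather than part of the answer's query; the aliases original_effdt, effdt_month and end_effdt_month are made up here for readability, and the inline jobs subquery again stands in for the real table.

```sql
-- Sample rows copied from the jobs data used in the 12-month version above.
with jobs as (
  select 123 emplid, date '2015-02-01' effdt, 900 deptid, 'A' hr_status from dual union all
  select 123, date '2015-06-15', 900, 'I' from dual union all
  select 123, date '2015-06-26', 900, 'A' from dual union all
  select 123, date '2015-08-01', 901, 'A' from dual
)
select emplid,
       deptid,
       hr_status,
       effdt                as original_effdt,
       -- start of the range, moved back to the first of its month
       trunc(effdt, 'mm')   as effdt_month,
       -- day before the next row starts (or sysdate for the latest row),
       -- moved back to the first of its month
       trunc(coalesce(lead(effdt) over (partition by emplid order by effdt) - 1,
                      sysdate), 'mm') as end_effdt_month
from   jobs
order  by original_effdt;
```

With this data the first three rows span 1 February to 1 June, 1 June to 1 June, and 1 June to 1 July 2015, so the month row dated 1 June 2015 falls inside all three ranges. All three carry DEPTID 900 and join to the same PPO benefits row, so they differ in nothing that the final select list returns, which is why the distinct is still needed.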
