在六个月内保留的参与者的百分比 [英] % of participants that are retained during six months

查看:125
本文介绍了在六个月内保留的参与者的百分比的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是一名非常新的MS SQL服务器的学校老师。每个人都建议尝试这个网站。开始!



我正在尝试撰写查询,以测试参加学术课程的不同类型的结果指标。有几种不同的方法来计算这个结果测量我想尝试。我试图计算的结果是:在程序六个月内保留的参与者的百分比是多少?我正在测试不同的方式来定义参与者和不同的时间范围。我正在尝试制作4个查询。不幸的是,我必须使用不同的表格:考勤,状态,Deenrolled,无效。我已经包含了每个下面的样本数据

查询




  1. 参与者被定义从2012年7月1日至2013年6月30日,每周至少两次上课一次,共6个月(共181天),所以会计年度的长短。如果参与者退出不活动,它们将被删除。

  2. 从2013年1月1日起,参加者被定义为每周至少两次参加课程,共6个月(共181天)。如果参与者已取消注册

  3. 参与者被定义为从2013年1月1日起每周至少参加班级两次的所有人,直到今天

  4. 参与者被定义为学生的注册开始日期,直到它们被取消注册或变为非活动状态。

参与者(分子)参与者/所有送达的学生(分母)



我正在寻找的4个查询输出是不同的版本:



示例

 参与者服务Percent_Served 
75 100 75%

我已经在使用不同版本的查询下面混淆了

  SELECT 
Count(distinct ID)as计数,
计数(DATEADD(dd,-181,DATEADD(wk,DATEDIFF(wk,0,Date),0))> 2作为参与者,
FROM Attendance
其中Attendence_date日期$ 07/01/2012'和'06 / 30/2013'
和ID不在(从非活动中选择ID)
或ID不在(从Deenrolled中选择ID)
GROUP BY ID

 code> SELECT 
Count(不同的ID)作为Count,
计数(DATEADD(dd,-181,DATEADD(wk,DA TEDIFF(wk,0,Date),0)) - 作为参与者的Enrolled_Date,
从出席者
其中Attendence_date日期在'07 / 01/2012'和'06 / 30/2013'之间
和ID不在(从非活动中选择ID)
或ID不在(从Deenrolled中选择ID)
GROUP BY ID

对这些查询的任何编程帮助都非常感谢。



以下是示例/示例数据集。



Attendence_date是学生参加一个班级的日期。

  CREATE TABLE Attendance(
ID int,
Attendence_date datetime


INSERT INTO考勤价值
(4504498,'7/1/2012'),
(4504498,'7/2/2012'),
(4504498,'7 / 3/2012'),
(4504498,'7/4/2012'),
(4504498,'7/5/2012'),
(4504498,'7/8 / 2012年),
(4504498,'7/9/2012'),
(4504498,'7/10/2012'),
(4504498,'7/11/2012' ),
(4504498,'7/12/2012'),
(4504498,'7/1/2012'),
(4504498,'7/2/2012'),
(4504498,'7/3/2012'),
(4504498,'7/4/2012'),
(4504498,'7/5/2012'),
(4504498,'7/8/2012'),
(4504498,'7/9/2012'),
(4504498,'7/10/2012'),
(4504498,'7/11/2012'),
(4504498,'7/12/2012'),
(9201052,'7/15/2012'),
(9201052 ,'7/16/2012'),
(9201052,'7/17/2012'),
(9201052,'7/17/2012'),
(9201052, 7/18/2012'),
(7949745,'7/17/2012'),
(7949745,'7/18/2012'),
(7949745,'7 / 23/2012'),
(7949745,7/23/2012),
(7949745,'7/24/2012'),
(7949745,'7/26 / 2012年7月26日),
(7949745,'7/26/2012'),
(7949745,'8/8/2012'),
(7949745,'8/8/2012' ),
(7949745,'11 / 5/2012'),
(7949745,'11 / 5/2012'),
(7949745,'11 / 5/2012'
(7949745,'11 / 6/2012'),
(7949745,'11 / 6/2012'),
(7949745,'11 / 6/2012'),
(7949745,'11 / 7/2012'),
(7949745,'11 / 7/2012'),
(7949745,'11 / 7/2012')

这是包含注册日期。

  CREATE TABLE [状态](
ID int,
Intake_Date datetime,
Engaged_Date datetime ,
Enrolled_Date datetime)
INSERT INTO [状态] VALUES
(7949745,'3/7/2012','7/17/2012','3/8/2012'),
(4504498,'2/21/2013','3/5/2013','3/22/2013'),
(1486279,'4/18/2013','5 / 7/2013','5/20/2013'),
(9201052,'5/15/2012','7/13/2012','5/15/2012'),
(1722390,'3/5/2012','8/27/2012','3/8/2012'),
(7735695,'9/7/2012','9/7/2012' ,'9/28/2012'),
(9261549,'3/7/2012','7/24/2012','3/8/2012'),
(3857008, 3/15/2013','3/18/2013','4/3/2013'),
(8502583,'3/14/2013','4/15/2013','5 / 3/2013'),
(1209774,'4/19/2012','1/1/2012','4/24/2012')

这是包含注销日期。

  CREATE TABLE Deenrolled(
ID int,
Deenrolled_Date datetime)
INSERT INTO Deenrolled VALUES
(7949745,'2/4/2013'),
(5485272,'07 / 08/2013'),
(8955628,'01 / 10/2013'),
(5123221,'7/8/2013'),
(5774753,'7/18/2013'),
(3005451,'2/18/2013'),
(7518818,'05 / 29/2013'),
(9656985,'6/20/2013'),
(2438101,'7/17/2013'),
(1437052 ,'7/25/2013'),
(9133874,'4/25/2013'),
(7007375,'6/19/2013'),
(3178181, 5/24/2013')

不活动

  CREATE TABLE Inactive(
ID int,
Effect_Date datetime)
INSERT INTO非活动值
(1209774,'10 / 12 / 2012年),
(5419494,'10 / 12/2012'),
(4853049,'10 / 9/2012'),
(1453678,'5/23/2013' ),
(1111554,'7/16/2012'),
(5564128 ,'2/15/2013'),
(1769234,'7/16/2012')


解决方案

我应该说这不是一件容易的事情。主要的问题是每周至少解决两次,这是六个月的时间 - 每周很容易计算两次,但应该是6个月!


当我试过为了解决这个问题,我发现了 Niels van der Rest 的绝对辉煌的答案 - 在一组数字中查找连续范围。所以我会给你一般的查询第1部分,你可以更改参数并获得第2部分的结果:

  declare @Weeks int,@PerWeek int,@StartDate date,@EndDate date,@count 

select
@StartDate ='20120701' ,
@EndDate ='20130630',
@Weeks = 26, - 6个月或26周
@PerWeek = 2 - 每周两次

选择@count = count(不同的A.ID)
从Attendance作为A
其中
A.Attendence_date在@StartDate和@EndDate和
之间A.ID不在(选择T. ID从Deenrolled为T)和
A.ID不在(选择T.ID从Inactive为T)

;与CTE为(
- 周数,过滤器日期
选择
A.ID,
datediff(dd,@StartDate,A.Attendence_date)/ 7作为Wk
从Attendance作为A
其中
A.Attendence_date @StartDate和@EndDate和
A.ID不在(选择T.ID from D
A.ID不在(从不活动的T选择T.ID
),CTE2为(
- 按星期分组,每周过滤@PerWeek ,计算行号
选择
Wk,ID,
row_number()over(按照Wk的ID排序分区)作为Row_Num
从CTE
组由Wk,ID
有count(*)> = @PerWeek

- 最终查询 - 按星期和row_number之间的差异组b $ b选择100 * cast(count(distinct ID)as as浮点数)/ @count
从CTE2
组按ID,Wk - Row_Num
有count(*)> = @Weeks

我创建了 SQL FIDDLE示例 ,您可以测试查询。



第3部分很简单

  declare @PerWeek int,@StartDate date 

select
@StartDate ='20130101',
@ PerWeek = 2 - 每周两次

选择@count = count(不同的A.ID)
从Attendance作为A
其中
A.Attendence_date> = @StartDate和
A.ID不在(从Deenrolled中选择T.ID作为T)和
A.ID不在(从不活动的T中选择T.ID)

;以CTE为(
- 周数,按日期过滤
选择
A.ID,
datediff(dd,@StartDate,A.Attendence_date)/ 7作为Wk
从Attendance作为A
其中
A.Attendence_date > = @StartDate和
A.ID不在(选择TID从Deenrolled为T)和
A.ID不在(从不活动的T选择T.ID
) ,CTE2为(
- 按周分组,过滤器少于@PerWeek每周
从CTE
组中选择不同ID
由Wk,ID
计数( *)> = @PerWeek

选择100 * cast(count(*)as float)/ @count from CTE2

第4部分似乎对我来说有点不清楚,你能澄清一下吗?


I am a school teacher very new to MS SQL server. Everyone is suggesting to try this site out. Here goes!

I am trying to write queries to test different types of outcome measures for the participation in a academic program. There are several different ways to calculate this outcome measurement I would like to try. The outcome which I am trying to Calculate is: What is the % of participants that are retained during six months of the program? I am testing different ways to define participant and different time ranges. There are 4 queries I am trying to produce. Unfortunately, I have to use for different tables: Attendance, Status, Deenrolled, Inactive. I have included sample data from each below

Queries

  1. A participant is defined as everyone that attended a class at least twice a week for 6 months (181 days total) starting at July 1st 2012 and Ending June 30th 2013, so the length of the fiscal year.If a participant is deenrolled or inactive they are dropped.
  2. A participant is defined as everyone that attended a class at least twice a week for 6 months (181 days total) starting at January 1st 2013. If a participant is deenrolled or becomes inactive they are dropped.
  3. A participant is defined as everyone that attended a class at least twice a week starting at January 1st 2013 until today
  4. A participant is defined as a student's enrollment start date until they are deenrolled or become inactive.

Participant (Numerator) participant / all students which were served (Denominator)

The 4 query outputs I am looking for are different versions of this:

Example

Participants    Served   Percent_Served
75               100        75%      

I have been messing around with different versions of the query below

 SELECT 
Count (distinct ID) as Count, 
  Count  ( DATEADD( dd, -181, DATEADD(wk, DATEDIFF(wk,0,Date), 0)) > 2 as Participants ,
FROM Attendance
where Attendence_date date between '07/01/2012' and '06/30/2013'
and ID not in (Select ID from Inactive) 
or ID not in (select ID from Deenrolled) 
GROUP BY ID

and

 SELECT 
Count (distinct ID) as Count, 
  Count  ( DATEADD( dd, -181, DATEADD(wk, DATEDIFF(wk,0,Date), 0)) - Enrolled_Date  as Participants ,
FROM Attendance
where Attendence_date date between '07/01/2012' and '06/30/2013'
and ID not in (Select ID from Inactive) 
or ID not in (select ID from Deenrolled) 
GROUP BY ID

Any programming assistance for these queries is greatly appreciated.

Below are the sample/example datasets.

Attendence_date is the date a student participated in one class.

CREATE TABLE Attendance (
    ID int,
    Attendence_date datetime
    )

INSERT INTO Attendance VALUES 
(4504498,  '7/1/2012'),
(4504498,  '7/2/2012'),
(4504498,   '7/3/2012'),
(4504498,   '7/4/2012'),
(4504498,   '7/5/2012'),
(4504498,   '7/8/2012'),
(4504498,   '7/9/2012'),
(4504498,   '7/10/2012'),
(4504498,   '7/11/2012'),
(4504498,   '7/12/2012'),
(4504498,   '7/1/2012'),
(4504498,   '7/2/2012'),
(4504498,   '7/3/2012'),
(4504498,   '7/4/2012'),
(4504498,   '7/5/2012'),
(4504498,   '7/8/2012'),
(4504498,   '7/9/2012'),
(4504498,   '7/10/2012'),
(4504498,   '7/11/2012'),
(4504498,   '7/12/2012'),
(9201052,   '7/15/2012'),
(9201052,   '7/16/2012'),
(9201052,   '7/17/2012'),
(9201052,   '7/17/2012'),
(9201052,   '7/18/2012'),   
(7949745,   '7/17/2012'),   
(7949745,   '7/18/2012'),
(7949745,   '7/23/2012'),   
(7949745,   '7/23/2012'),   
(7949745,   '7/24/2012'),
(7949745,   '7/26/2012'),
(7949745,   '7/26/2012'),   
(7949745,   '8/8/2012'),    
(7949745,   '8/8/2012'),    
(7949745,   '11/5/2012'),   
(7949745,   '11/5/2012'),   
(7949745,   '11/5/2012'),   
(7949745,   '11/6/2012'),   
(7949745,   '11/6/2012'),   
(7949745,   '11/6/2012'),   
(7949745,   '11/7/2012'),   
(7949745,   '11/7/2012'),   
(7949745,   '11/7/2012')

Here is the contains the enrollment date.

CREATE TABLE [Status] (
    ID int,
    Intake_Date datetime ,
   Engaged_Date datetime ,
   Enrolled_Date datetime)
INSERT INTO [Status] VALUES 
(7949745, '3/7/2012',   '7/17/2012', '3/8/2012'),
(4504498, '2/21/2013',  '3/5/2013',  '3/22/2013'),
(1486279, '4/18/2013',  '5/7/2013',   '5/20/2013'),
(9201052, '5/15/2012',  '7/13/2012',  '5/15/2012'),
(1722390, '3/5/2012',   '8/27/2012', '3/8/2012'),
(7735695, '9/7/2012',   '9/7/2012',  '9/28/2012'),
(9261549, '3/7/2012',   '7/24/2012', '3/8/2012'),
(3857008, '3/15/2013',  '3/18/2013', '4/3/2013'),
(8502583, '3/14/2013',     '4/15/2013', '5/3/2013'),
(1209774,  '4/19/2012',  '1/1/2012',   '4/24/2012') 

Here is the contains the de-enrollment date.

CREATE TABLE Deenrolled (
    ID int,
    Deenrolled_Date datetime)
INSERT INTO Deenrolled  VALUES 
(7949745,    '2/4/2013'),
(5485272,    '07/08/2013'),
(8955628,    '01/10/2013'),
(5123221,    '7/8/2013'),
(5774753,    '7/18/2013'),
(3005451,    '2/18/2013'),
(7518818,    '05/29/2013'),
(9656985,    '6/20/2013'),
(2438101,    '7/17/2013'),
(1437052,    '7/25/2013'),
(9133874,    '4/25/2013'),
(7007375,    '6/19/2013'),
(3178181,    '5/24/2013')

And inactive

CREATE TABLE Inactive (
    ID int,
   Effect_Date datetime)
INSERT INTO Inactive VALUES 
(1209774,       '10/12/2012'),
(5419494,       '10/12/2012'),
(4853049,       '10/9/2012'),
(1453678,       '5/23/2013'),
(1111554,       '7/16/2012'),
(5564128,       '2/15/2013'),
(1769234,       '7/16/2012')

解决方案

Well I should say that is not an easy one. The main problem was to solve 'at least twice a week for a sixth months' part - it's easy to calculate twice a week, but it should be 6 continous months!

While I've tried to solve it, I've found absolutely brilliant answer by Niels van der Rest - Finding continuous ranges in a set of numbers. So I'll give you general query for the Part 1, you can change parameters and get result for Part 2:

declare @Weeks int, @PerWeek int, @StartDate date, @EndDate date, @count

select
    @StartDate = '20120701',
    @EndDate = '20130630',
    @Weeks = 26, -- 6 month or 26 weeks
    @PerWeek = 2 -- twice per week

select @count = count(distinct A.ID)
from Attendance as A
where
    A.Attendence_date between @StartDate and @EndDate and
    A.ID not in (select T.ID from Deenrolled as T) and
    A.ID not in (select T.ID from Inactive as T)

;with CTE as (
    -- Week numbers, filter by dates
    select
        A.ID,
        datediff(dd, @StartDate, A.Attendence_date) / 7 as Wk
    from Attendance as A
    where
        A.Attendence_date between @StartDate and @EndDate and
        A.ID not in (select T.ID from Deenrolled as T) and
        A.ID not in (select T.ID from Inactive as T)
  ), CTE2 as (
    -- Group by week, filter less then @PerWeek per week, calculate row number
    select
        Wk, ID,
        row_number() over (partition by ID order by Wk) as Row_Num
    from CTE
    group by Wk, ID
    having count(*) >= @PerWeek
)
-- Final query - group by difference between week and row_number
select 100 * cast(count(distinct ID) as float) / @count
from CTE2
group by ID, Wk - Row_Num
having count(*) >= @Weeks

I've created SQL FIDDLE EXAMPLE, you can test the query.

Part 3 is easy

declare @PerWeek int, @StartDate date

select
    @StartDate = '20130101',
    @PerWeek = 2 -- twice per week

select @count = count(distinct A.ID)
from Attendance as A
where
    A.Attendence_date >= @StartDate and
    A.ID not in (select T.ID from Deenrolled as T) and
    A.ID not in (select T.ID from Inactive as T)

;with CTE as (
    -- Week numbers, filter by dates
    select
        A.ID,
        datediff(dd, @StartDate, A.Attendence_date) / 7 as Wk
    from Attendance as A
    where
        A.Attendence_date >= @StartDate and
        A.ID not in (select T.ID from Deenrolled as T) and
        A.ID not in (select T.ID from Inactive as T)
  ), CTE2 as (
    -- Group by week, filter less then @PerWeek per week
    select distinct ID
    from CTE
    group by Wk, ID
    having count(*) >= @PerWeek
)
select 100 * cast(count(*) as float) / @count from CTE2

Part 4 seems a bit unclear for me, could you clarify?

这篇关于在六个月内保留的参与者的百分比的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆