计算不同记录的窗口函数 [英] Window functions to count distinct records

查看:24
本文介绍了计算不同记录的窗口函数的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

下面的查询基于一个复杂的视图,并且该视图按我的意愿工作(我不打算包含该视图,因为我认为它对解决手头的问题没有帮助).我无法理解的是 drugCountsinFamilies 列.我需要它来显示每个药物系列的 distinct drugName 的数量.您可以从第一个屏幕截图中看到有三个不同的 H3A 行.H3A 的 drugCountsInFamilies 应该是 3(有三种不同的 H3A 药物.)

The query below is based on a complicated view and the view works as I want it to (I'm not going to include the view because I don't think it will help with the question at hand). What I can't get right is the drugCountsinFamilies column. I need it to show me the number of distinct drugNames for each drug family. You can see from the first screencap that there are three different H3A rows. The drugCountsInFamilies for H3A should be 3 (there are three different H3A drugs. )

您可以从第二个屏幕截图中看到,第一个屏幕截图中的 drugCountsInFamilies 正在捕获列出药物名称的行数.

You can see from the second screen cap that what's happening is the drugCountsInFamilies in the first screen cap is catching the number of rows that the drug name is listed on.

下面是我的问题,对不正确的部分进行评论

Below is my question, with comments on the part that is incorrect

select distinct
     rx.patid
    ,d2.fillDate
    ,d2.scriptEndDate
    ,rx.drugName
    ,rx.drugClass
    --the line directly below is the one that I can't figure out why it's wrong
    ,COUNT(rx.drugClass) over(partition by rx.patid,rx.drugclass,rx.drugname) as drugCountsInFamilies
from 
(
select 
    ROW_NUMBER() over(partition by d.patid order by d.patid,d.uniquedrugsintimeframe desc) as rn
    ,d.patid
    ,d.fillDate
    ,d.scriptEndDate
    ,d.uniqueDrugsInTimeFrame
    from DrugsPerTimeFrame as d
)d2
inner join rx on rx.patid = d2.patid
inner join DrugTable as dt on dt.drugClass=rx.drugClass
where d2.rn=1 and rx.fillDate between d2.fillDate and d2.scriptEndDate
and dt.drugClass in ('h3a','h6h','h4b','h2f','h2s','j7c','h2e')
order by rx.patid

如果我尝试在 count(rx.drugClass) 子句中添加一个不同的内容,则 SSMS 会生气.可以用窗口函数来完成吗?

SSMS gets mad if I try to add a distinct to the count(rx.drugClass) clause. Can it be done using window functions?

推荐答案

count(distinct) 作为 Windows 函数执行需要一个技巧.实际上有几个级别的技巧.

Doing a count(distinct) as a windows function requires a trick. Several levels of tricks, actually.

因为您的请求实际上非常简单——该值始终为 1,因为 rx.drugClass 在分区子句中——我将做一个假设.假设您想计算每个患者的独特药物类别的数量.

Because your request is actually truly simple -- the value is always 1 because rx.drugClass is in the partitioning clause -- I will make an assumption. Let's say you want to count the number of unique drug classes per patid.

如果是这样,请执行由 patid 和 drugClass 分区的 row_number().当此值为 1 时,在 patid 中,则开始一个新的药物类.创建一个标志,在这种情况下为 1,在所有其他情况下为 0.

If so, do a row_number() partitioned by patid and drugClass. When this is 1, within a patid, , then a new drugClass is starting. Create a flag that is 1 in this case and 0 in all other cases.

然后,您可以简单地使用分区子句执行 sum 以获得不同值的数量.

Then, you can simply do a sum with a partitioning clause to get the number of distinct values.

查询(格式化后,以便我可以阅读),看起来像:

The query (after formatting it so I can read it), looks like:

select rx.patid, d2.fillDate, d2.scriptEndDate, rx.drugName, rx.drugClass,
       SUM(IsFirstRowInGroup) over (partition by rx.patid) as NumDrugCount
from (select distinct rx.patid, d2.fillDate, d2.scriptEndDate, rx.drugName, rx.drugClass,
             (case when 1 = ROW_NUMBER() over (partition by rx.drugClass, rx.patid order by (select NULL))
                   then 1 else 0
              end) as IsFirstRowInGroup
      from (select ROW_NUMBER() over(partition by d.patid order by d.patid,d.uniquedrugsintimeframe desc) as rn, 
                   d.patid, d.fillDate, d.scriptEndDate, d.uniqueDrugsInTimeFrame
            from DrugsPerTimeFrame as d
           ) d2 inner join
           rx
           on rx.patid = d2.patid inner join
           DrugTable dt
           on dt.drugClass = rx.drugClass
      where d2.rn=1 and rx.fillDate between d2.fillDate and d2.scriptEndDate and
            dt.drugClass in ('h3a','h6h','h4b','h2f','h2s','j7c','h2e')
     ) t
order by patid

这篇关于计算不同记录的窗口函数的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆