How to count number of non-consecutive values in a column using SQL?
Question
Following up on my question here. Say I have a table in an Oracle database like the one below (table_1), which tracks service involvement for a particular individual:
name day srvc_inv
bill 1 1
bill 2 1
bill 3 0
bill 4 0
bill 5 1
bill 6 0
susy 1 1
susy 2 0
susy 3 1
susy 4 0
susy 5 1
My goal is to get a summary table which lists, for all unique individuals, whether there was service involvement and the number of distinct service episodes (in this case 2 for bill and 3 for susy), where a distinct service episode is identified by a break in activity over days.
To get any service involvement, I would use the following query:
SELECT table_1.name, MAX(table_1.srvc_inv) AS any_invl
FROM table_1
GROUP BY table_1.name
However, I'm stuck as to how I would get the number of distinct service episodes (for example, 2 for bill). Using a static data frame in R, you would use run-length encoding (see my original question), but I don't know how I could accomplish this in SQL. This operation would be run over a large number of records, so it would be impractical to store the entire data frame as an object and then process it in R.
My desired output is as follows:
name any_invl n_srvc_inv
bill 1 2
susy 1 3
Thanks for your help!
Recommended answer
I would suggest using lag(). The idea is to count a "1", but only when the preceding value is zero or null:
select name, count(*)
from (select t.*,
             lag(srvc_inv) over (partition by name order by day) as prev_srvc_inv
      from t
     ) t
where (prev_srvc_inv is null or prev_srvc_inv = 0) and
      srvc_inv = 1
group by name;
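To see why this counts episodes, it can help to run the inner query on its own and look at which rows get picked. The following is only a sketch: it assumes the table is named t with columns name, day and srvc_inv as in the answer, and the is_start column is a hypothetical name added here for illustration.

```sql
-- Sketch: show each row next to the previous day's value for the same person.
-- is_start (hypothetical column) is 1 only where a run of 1s begins,
-- i.e. the current value is 1 and the previous value is 0 or missing.
select t.*,
       lag(srvc_inv) over (partition by name order by day) as prev_srvc_inv,
       case when srvc_inv = 1
             and coalesce(lag(srvc_inv) over (partition by name order by day), 0) = 0
            then 1 else 0
       end as is_start
from t
order by name, day;
```

With the sample data from the question, bill's rows would be flagged on days 1 and 5, and susy's on days 1, 3 and 5, which matches the expected episode counts of 2 and 3.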
You can simplify this a little by using a default value for lag():
select name, count(*)
from (select t.*,
             lag(srvc_inv, 1, 0) over (partition by name order by day) as prev_srvc_inv
      from t
     ) t
where prev_srvc_inv = 0 and srvc_inv = 1
group by name;
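The queries above return only the episode count, and because the filter sits in the where clause, a person with no service involvement at all would not appear in the result. One possible way to get both columns of the desired output in a single pass is to move the condition into a conditional sum. This is a sketch, not part of the original answer; it assumes the table is table_1 from the question and that srvc_inv only ever holds 0 or 1.

```sql
-- Sketch: one row per person with both summary columns from the question.
-- Assumes table_1(name, day, srvc_inv) with srvc_inv restricted to 0 or 1.
select name,
       max(srvc_inv) as any_invl,                    -- 1 if any day has involvement
       sum(case when srvc_inv = 1 and prev_srvc_inv = 0
                then 1 else 0
           end) as n_srvc_inv                        -- number of 0 -> 1 transitions
from (select t.*,
             lag(srvc_inv, 1, 0) over (partition by name order by day) as prev_srvc_inv
      from table_1 t
     ) t
group by name
order by name;
```

Since nothing is filtered out before grouping, a person whose srvc_inv is always 0 would still be listed, with any_invl and n_srvc_inv both equal to 0.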