分组前80%的类别 [英] Grouping of top 80% categories

查看:120
本文介绍了分组前80%的类别的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要SQL查询进行分组,这些分类只包含那些总共包含至少80%所有类别的组,其他罕见类别(包含总数的20%)应该表示为其他



因此,按分类颜色对苹果进行分组的查询结果应该如下所示:

  RED 1118 44%)
黄色711 28%>至少80%
绿色229 9%)
其他482 19%

如何做到这一点?

解决方案

我会用聚合和分析功能的组合来做到这一点。当稀有物的累积总和低于20%时,颜色被放入其他类别:

  select(case when cumcntdesc  else color 
end)as color,sum(cnt)as cnt
from(select color,count(*)as cnt,$ b $作为cumcntdesc,
sum(count(*))over()的总数(count(*))over(count by(*)asc)as totalcnt
from t
group by color
)t
group by(cumcntdesc< totalcnt * 0.2 then then'other'
else color
end)
pre>

此处是一个SQL小提琴。


I need SQL query for grouping by some category which would present quantities of only those groups which in total contain at least 80% of all categories, other rare categories (containing up to 20% of total) should be represented like "other".

So the result of such a query for grouping apples by category color should look like this:

RED    1118 44% )
YELLOW  711 28% > at least 80%
GREEN   229  9% )
other   482 19%

How to do that?

解决方案

I would do this with a combination of aggregation and analytic functions. The colors are put in the "other" category when the cumulative sum of the rarest is under 20%:

select (case when cumcntdesc < totalcnt * 0.2 then 'other'
             else color
        end) as color, sum(cnt) as cnt
from (select color, count(*) as cnt,
             sum(count(*)) over (order by count(*) asc) as cumcntdesc,
             sum(count(*)) over () as totalcnt
      from t
      group by color
     ) t
group by (case when cumcntdesc < totalcnt * 0.2 then 'other'
               else color
          end)

Here is a SQL Fiddle.

这篇关于分组前80%的类别的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆