获得每个类别的十大产品 [英] Get top 10 products for every category

查看:93
本文介绍了获得每个类别的十大产品的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个类似这样的查询

I have a query which is something like this

SELECT 
t.category, 
tc.product, 
tc.sub-product,
 count(*) as sales 
 FROM tg t, ttc tc
 WHERE t.value = tc.value
 GROUP BY t.category, tc.product, tc.sub-product;

现在,在我的查询中,我想获得每个类别的前10名产品(按销售量排名最高),并且每个类别我需要前5个子类别(按销售量排名第一)

Now in my query I want to get top 10 products for every category (top by sales ) and for every category I need top 5 sub category (top by sales)

您可以假设问题陈述是这样的:

You can assume the problem statement as something like this :

按销售获得每个类别的前10个产品,按销售获得每个前5个子产品.

Get top 10 products for each category by sales and for each product get top 5 sub-products by sales .

  • 这里的类别可以是图书
  • 产品可以是Harry Porter的书
  • 子产品可以是HarryPorter系列5

样本输入数据格式

category |product |subproduct |Sales [count (*)]

abc   test1    test11     120

abc   test1    test11     100

abc   test1    test11     10

abc   test1    test11     10

abc   test1    test11     10

abc   test1    test11     10

abc   test1    test12     10

abc   test1    test13     8

abc   test1    test14     6

abc   test1    test15     5

abc   test2    test21     80

abc   test2    test22     60

abc   test3    test31     50

abc   test3    test32     40

abc   test4    test41     30

abc   test4    test42     20

abc   test5    test51     10

abc   test5    test52     5 

abc   test6    test61     5 

|

|

|

bcd   test2    test22     10 

xyz   test3    test31     5 

xyz   test3    test32     3 

xyz   test4    test41     2

输出为"

top 5 rf for (abc) -> abc,test1(289) abc,test2 (140), abc test3 (90), abc test4(50) , abc test5 (15)

top 5 rfm for (abc,test1) -> test11(260),test12(10),test13(8),test14(6),test15(5) and so on

我的查询失败,因为结果确实很大.我正在阅读有关诸如rank之类的oracle分析功能的信息.有人可以帮我使用解析函数修改此查询.任何其他方法也可以使用.

My query is failing because results are really huge . I am reading about oracle analytic functions like rank. Can someone help me modifying this query using analytical functions. Any other approach can also work.

我指的是 http://www.orafaq.com/node/55 .但是无法为此获得正确的sql查询.

I am referring to this http://www.orafaq.com/node/55. But unable to get a right sql query for this.

任何帮助将不胜感激..我喜欢在此停留两天:(

Any help would be appreciated..I am like stuck for 2 days on this :(

推荐答案

可能不使用解析函数,而是单独使用解析函数的原因:

There are probably reasons not to use analytical functions, but using analytical functions alone:

select am, rf, rfm, rownum_rf2, rownum_rfm
from
(
    -- the 3nd level takes the subproduct ranks, and for each equally ranked
    -- subproduct, it produces the product ranking
    select am, rf, rfm, rownum_rfm,
      row_number() over (partition by rownum_rfm order by rownum_rf) rownum_rf2
    from
    (
        -- the 2nd level ranks (without ties) the products within
        -- categories, and subproducts within products simultaneosly
        select am, rf, rfm,
          row_number() over (partition by am order by count_rf desc) rownum_rf,
          row_number() over (partition by am, rf order by count_rfm desc) rownum_rfm
        from
        (
            -- inner most query counts the records by subproduct
            -- using regular group-by. at the same time, it uses
            -- the analytical sum() over to get the counts by product
            select tg.am, ttc.rf, ttc.rfm,
              count(*) count_rfm,
              sum(count(*)) over (partition by tg.am, ttc.rf) count_rf
            from tg inner join ttc on tg.value = ttc.value
            group by tg.am, ttc.rf, ttc.rfm
        ) X
    ) Y
    -- at level 3, we drop all but the top 5 subproducts per product
    where rownum_rfm <= 5   -- top  5 subproducts
) Z
-- the filter on the final query retains only the top 10 products
where rownum_rf2 <= 10  -- top 10 products
order by am, rownum_rf2, rownum_rfm;

我使用rownum而不是等级,因此您永远不会获得联系,换句话说,联系将是随机决定的.如果数据不够密集(前10个产品中的任何5个子产品少于-可能显示其他产品的子产品),这也将不起作用.但是,如果数据密集(建立的数据库很大),则查询应该可以正常工作.

I used rownum instead of rank so you don't ever get ties, or in other words, ties will be randomly decided. This also doesn't work if the data is not dense enough (less than 5 subproducts in any of the top 10 products - it may show subproducts from some other products instead). But if the data is dense (large established database), the query should work fine.


下面的数据进行了两次传递,但在每种情况下均返回正确的结果.同样,这是一个无联系等级查询.


The below makes two passes of the data, but returns correct results in each case. Again, this is a rank-without-ties query.

select am, rf, rfm, count_rf, count_rfm, rownum_rf, rownum_rfm
from
(
    -- next join the top 10 products to the data again to get
    -- the subproduct counts
    select tg.am, tg.rf, ttc.rfm, tg.count_rf, tg.rownum_rf, count(*) count_rfm,
        ROW_NUMBER() over (partition by tg.am, tg.rf order by 1 desc) rownum_rfm
    from (
        -- first rank all the products
        select tg.am, tg.value, ttc.rf, count(*) count_rf,
            ROW_NUMBER() over (order by 1 desc) rownum_rf
        from tg
        inner join ttc on tg.value = ttc.value
        group by tg.am, tg.value, ttc.rf
        order by count_rf desc
        ) tg
    inner join ttc on tg.value = ttc.value and tg.rf = ttc.rf
    -- filter the inner query for the top 10 products only
    where rownum_rf <= 10
    group by tg.am, tg.rf, ttc.rfm, tg.count_rf, tg.rownum_rf
) X
-- filter where the subproduct rank is in top 5
where rownum_rfm <= 5
order by am, rownum_rf, rownum_rfm;

列:

count_rf : count of sales by product
count_rfm : count of sales by subproduct
rownum_rf : product rank within category (rownumber - without ties)
rownum_rfm : subproduct rank within product (without ties)

这篇关于获得每个类别的十大产品的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆