每个组中值的最高出现 [英] Highest appearance of a value within each group

查看:86
本文介绍了每个组中值的最高出现的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一张带有用户ID和他浏览过的Webside-ID的表.该表如下所示:

I'm having a table with an user-ID and the Webside-IDs he navigated through. The table looks like this:

| user-ID | website-ID |
|       1 |          1 |
|       1 |          2 |
|       1 |          1 |
|       1 |          4 |
|       2 |          1 |
|       2 |          2 |
|       2 |          5 |
|       2 |          2 |

我想选择每个用户最常访问的网站ID.如您所见,最小最大平均在这里不起作用.有任何想法吗?

I'd like to select the website-ID each user has been into the most. As you can see, a min, max or avg won't work here. Any ideas?

推荐答案

这可能看起来很复杂,但只会仅一次汇总数据,然后对结果进行排名并仅选择第一个一个

This may look complicated but it will only aggregate the data once only, then pass through it ranking the results and choosing only the first one

select userid, websiteid, visits
from
(
select 
 userid, websiteid, visits,
 @r := case when @u=userid then @r+1 else 1 end r,
 @u := userid
from
(select @u:=null) x,
(select userid, websiteid, count(*) visits
 from visit
 group by userid, websiteid
 order by userid, visits desc) y
) z
where r=1

内部选择为每个用户-网站组合生成计数,并按访问次数排序.然后通过中间查询对记录per user进行排名,并在列r中给出排名.

The inner select generates counts for each user-website combination and orders it by the most visits. This is then passed through the middle query that ranks the records per user, giving the rank in the column r.

这是一个变体,它将显示具有相等排名的用户的所有网站.与上一个查询的区别在于,如果网站A和B都有来自用户X的10次访问,则结果中将同时列出A和B,而前一个查询则随机选择一个来显示.

This is a variation, which will show ALL websites for a user that have EQUAL rank. The difference with the previous query is that if websites A and B both have 10 visits from user X, both A and B are listed in the result whereas the previous query selects one randomly to show.

select userid, websiteid, visits
from
(
select 
 userid, websiteid, visits,
 @r := case
    when @u=userid and @v=visits then @r  # same rank
    when @u=userid then @r+1              # next rank
    else 1                                # different user
    end r,
 @u := userid, @v := visits
from
(select @u:=null, @v:=null) x,
(select userid, websiteid, count(*) visits
 from visit
 group by userid, websiteid
 order by userid, visits desc) y
) z
where r=1

这是使用的测试表

create table visit (userid int, websiteid int);
insert into visit select 1,1;
insert into visit select 1,2;
insert into visit select 1,1;
insert into visit select 1,4;
insert into visit select 2,1;
insert into visit select 2,2;
insert into visit select 2,5;
insert into visit select 2,2;

这篇关于每个组中值的最高出现的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆