MySQL conditional self join


Problem description

CREATE TABLE dialoglog (
   userid int,
   appid int,
   type varchar(100) ,
   ds timestamp
 );

Data

userid  appid   type    ds
1        1      imp     January, 19 2016 03:14:07
1        1      click   January, 20 2016 03:14:07
2        1      imp     January, 19 2016 03:14:07
2        1      click   January, 18 2016 03:14:07
3        6       imp    January, 19 2016 03:14:07
10       1      click   January, 19 2016 03:14:07
1        1       imp    January, 20 2016 03:12:07
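
To reproduce the sample locally, the rows above can be loaded with a statement like the following. This is a sketch that assumes the dialoglog table defined earlier and rewrites the displayed dates as MySQL datetime literals; the values themselves are copied from the table above.

```sql
-- Sample rows from the data table, in the same order as listed.
INSERT INTO dialoglog (userid, appid, type, ds) VALUES
  (1,  1, 'imp',   '2016-01-19 03:14:07'),
  (1,  1, 'click', '2016-01-20 03:14:07'),
  (2,  1, 'imp',   '2016-01-19 03:14:07'),
  (2,  1, 'click', '2016-01-18 03:14:07'),  -- click before the impression
  (3,  6, 'imp',   '2016-01-19 03:14:07'),
  (10, 1, 'click', '2016-01-19 03:14:07'),  -- click without any impression
  (1,  1, 'imp',   '2016-01-20 03:12:07');
```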

Question

What is the CTR (click/imp) per appid?

We want to print the CTR per group.

The problem is that we have "bad" data:

  • User 10 clicked on appid 1 but has no impression for appid 1 => we should ignore it
  • User 2 clicked on appid 1 before having an impression for appid 1 => we should ignore it

Therefore, a query such as:

SELECT SUM(CASE WHEN type='click' THEN 1 else 0 END)*1.0/SUM(CASE WHEN dialoglog.type='imp' THEN 1 else 0 END) as ctr,appid 
     FROM  dialoglog GROUP BY  appid;

would not work because of the constraints just described. Is it possible to encode all these constraints in a single MySQL query?

Here is a sqlfiddle to play with.

For appid 1 we have 3 impressions and 1 valid click (user 2's click is ignored because its timestamp is earlier than the impression's, and user 10's click is ignored because there was no impression), so the CTR should be 1/3.

For appid 6, we only have one impression => CTR = 0

Therefore I am expecting this result:

appid  CTR
1      0.333
6      0

What I tried so far

I thought about doing a left join, but the problem is that it duplicates rows from the right table:

SELECT * FROM dialoglog t1 LEFT JOIN dialoglog t2 on t1.userid=t2.userid and t1.appid=t2.appid and t1.type<>t2.type WHERE t1.type="imp" and t1.ds<t2.ds;



userid  appid   type    ds                   userid appid   type    ds
1        1      imp January, 19 2016 03:14:07   1    1    click January, 20 2016 03:14:07
1        1      imp January, 19 2016 03:12:07   1    1    click January, 20 2016 03:14:07
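
The click row appears twice in this output because it pairs with each of user 1's two impressions. One way around that, sketched here as an assumption rather than as part of the original attempt, is to test for the existence of a matching impression instead of joining to it, so each click is returned at most once. The aliases c and i are illustrative, and treating an impression with the same timestamp as the click as valid is also an assumption.

```sql
-- List only the clicks that satisfy both rules: the same user must have
-- an impression on the same appid at or before the click's timestamp.
SELECT c.userid, c.appid, c.ds
FROM dialoglog c
WHERE c.type = 'click'
  AND EXISTS (
        SELECT 1
        FROM dialoglog i
        WHERE i.type   = 'imp'
          AND i.userid = c.userid
          AND i.appid  = c.appid
          AND i.ds    <= c.ds
      );
```

On the sample data this should leave only user 1's click: user 2's click predates the impression, and user 10 has no impression at all.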

Recommended answer

You seem to need the earliest "imp" for each user. Then you can aggregate over the rows at or after that point in time:

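SELECT dl.appid,
       SUM(dl.type = 'click') / SUM(dl.type = 'imp') AS ctr
FROM dialoglog dl JOIN
     -- earliest impression per (userid, appid), so that an impression
     -- on one app cannot validate a click on another
     (SELECT dl2.userid, dl2.appid, MIN(dl2.ds) AS imp_ds
      FROM dialoglog dl2
      WHERE dl2.type = 'imp'
      GROUP BY dl2.userid, dl2.appid
     ) dlimp
     ON dl.userid = dlimp.userid AND dl.appid = dlimp.appid
-- keep only rows at or after that first impression
WHERE dl.ds >= dlimp.imp_ds
GROUP BY dl.appid;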
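
As a cross-check, the same rules can be phrased without a derived table by keeping every impression plus only those clicks that have a qualifying impression. This is an alternative sketch, not the answer's method; the alias i and the extra imps and valid_clicks columns are illustrative additions that expose the counts behind each ratio.

```sql
-- Count all impressions, but only clicks preceded (or matched in time)
-- by an impression from the same user on the same appid.
SELECT dl.appid,
       SUM(dl.type = 'imp')   AS imps,
       SUM(dl.type = 'click') AS valid_clicks,
       SUM(dl.type = 'click') / SUM(dl.type = 'imp') AS ctr
FROM dialoglog dl
WHERE dl.type = 'imp'
   OR (dl.type = 'click'
       AND EXISTS (SELECT 1
                   FROM dialoglog i
                   WHERE i.type   = 'imp'
                     AND i.userid = dl.userid
                     AND i.appid  = dl.appid
                     AND i.ds    <= dl.ds))
GROUP BY dl.appid;
```

On the sample data this should report 3 impressions and 1 valid click for appid 1, and 1 impression with no clicks for appid 6, which matches the expected CTRs of 1/3 and 0.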
