使用LEFT JOIN和SELF JOIN以及汇总函数计算accepting_ratio [英] Calculating acceptance_ratio with LEFT JOIN and SELF JOIN and aggregate function

查看:89
本文介绍了使用LEFT JOIN和SELF JOIN以及汇总函数计算accepting_ratio的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

尝试从连接"表计算每日接受率,该表有4个带有样本值的字段:

Trying to calculate daily acceptance ratios from the 'connecting' table which has 4 fields with sample values:

date          action         sender_id        recipient_id 
'2017-01-05', 'request_link', 'frank', 'joe' 
'2017-01-06', 'request_link', 'sally', 'ann' 
'2017-01-07', 'request_link', 'bill', 'ted' 
'2017-01-07', 'accept_link', 'joe', 'frank' 
'2017-01-06', 'accept_link', 'ann', 'sally' 
'2017-01-06', 'accept_link', 'ted', 'bill' 

由于01-05上有0个接受和1个请求,因此其每日接受比率应为0/1 =0.类似地,01-06的比率应为2/1,而对于01-05的比率应为1/1 01-07.

Because there are 0 accepts and 1 request on 01-05, its daily acceptance ratio should be 0/1 = 0. Similarly, the ratio for 01-06 should be 2/1, and it should be 1/1 for 01-07.

但是重要的是,每个accept_link都有一个对应的request_link,其中request_link的sender_id = accept_link的receive_id(反之亦然).因此,我认为在这里需要进行自我加入,以确保无论日期如何,乔都接受弗兰克的要求.

It is important however that each accept_link has a corresponding request_link where the sender_id of the request_link = the recipient_id of the accept_link (and vice versa). So here a self-join is required I believe to ensure that Joe accepts Frank's request, regardless of the date.

如何更正以下查询,以便聚合在保留所需的加入条件的同时正确地工作?如果删除了两个WHERE条件,或者是否有必要,查询是否可以按原样正确计算?

How can the below query be corrected so that the aggregation works correctly while retaining the required join conditions? Will the query calculate correctly as is if the two WHERE conditions are removed, or are they necessary?

SELECT f1.date, 
    SUM(CASE WHEN f2.action = 'accept_link' THEN 1 ELSE 0 END) /
    SUM(CASE WHEN f2.action = 'request_link' THEN 1 ELSE 0 END) AS acceptance_ratio
FROM connecting f1
LEFT JOIN connecting f2
ON f1.sender_id = f2.recipient_id
LEFT JOIN connecting f2
ON f1.recipient_id = f2.sender_id
WHERE f1.action = 'request_link'
AND f2.action = 'accept_link'
GROUP BY f1.date
ORDER BY f1.date ASC

预期的输出应类似于:

date          acceptance_ratio
'2017-01-05'  0.0000
'2017-01-06'  2.0000
'2017-01-07'  1.0000

谢谢.

推荐答案

再一次,我认为您不需要在此处使用自连接.相反,只需对整个表使用条件聚合,并计算每天发生的请求和接受的数量:

Once again, I don't think you need to be using a self join here. Instead, just use conditional aggregation over the entire table, and count the number of requests and accepts which happened on each day:

SELECT t.date,
       CASE WHEN t.num_requests = 0
            THEN 'No requests available'
            ELSE CAST(t.num_accepts / t.num_requests AS CHAR(50))
       END AS acceptance_ratio
FROM
(
    SELECT c1.date,
           SUM(CASE WHEN c1.action = 'accept_link' AND c2.action IS NOT NULL
                    THEN 1 ELSE 0 END) AS num_accepts,
           SUM(CASE WHEN c1.action = 'request_link' THEN 1 ELSE 0 END) AS num_requests
    FROM connecting c1
    LEFT JOIN connecting c2
        ON c1.action       = 'accept_link'   AND
           c2.action       = 'request_link'  AND
           c1.sender_id    = c2.recipient_id AND
           c2.recipient_id = c1.sender_id
    GROUP BY c1.date
) t
ORDER BY t.date

请注意,在这里我使用CASE表达式来处理被零除的情况,如果某天没有请求,这种情况可能会发生.我在这里还假设同一邀请不会被多次发送.

Note here that I use a CASE expression to handle divide by zero, which could occur should a certain day no requests. I also assume here that the same invitation will not be sent out more than once.

这篇关于使用LEFT JOIN和SELF JOIN以及汇总函数计算accepting_ratio的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆