SQL - 不等左连接 BigQuery [英] SQL - Unequal left join BigQuery

查看:16
本文介绍了SQL - 不等左连接 BigQuery的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

新来的.随着时间的推移,我试图获得每日和每周的活跃用户.他们有 30 天的时间被视为不活跃.我的目标是创建可以按 user_id 拆分的图表,以显示群组、地区、类别等.

New here. I am trying to get the Daily and Weekly active users over time. they have 30 days before they are considered inactive. My goal is to create graph's that can be split by user_id to show cohorts, regions, categories, etc.

我创建了一个日期表来获取该时间段的每一天,并且我有一个简化的订单表,其中包含我需要计算的基本信息.

I have created a date table to get every day for the time period and I have the simplified orders table with the base info that I need to calculate this.

我正在尝试使用以下 SQL 查询执行左连接以按日期获取状态:

I am trying to do a Left Join to get the status by date using the following SQL Query:

WITH daily_use AS (
        SELECT
          __key__.id AS user_id
          , DATE_TRUNC(date(placeOrderDate), day) AS activity_date
        FROM `analysis.Order`
        where isBuyingGroupOrder = TRUE 
          AND testOrder = FALSE
        GROUP BY 1, 2
 ),
dates AS (
        SELECT DATE_ADD(DATE "2016-01-01", INTERVAL d.d DAY) AS date
        FROM
          (
           SELECT ROW_NUMBER() OVER(ORDER BY __key__.id) -1 AS d
           FROM `analysis.Order`
           ORDER BY __key__.id
           LIMIT 1096
          ) AS  d
        ORDER BY 1 DESC
      )

SELECT
      daily_use.user_id
    , wd.date AS date
    , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
FROM dates AS wd

LEFT JOIN daily_use
    ON wd.date >= daily_use.activity_date
    AND wd.date < DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)

GROUP BY 1,2

我收到此错误:如果不满足连接两侧字段相等的条件,则无法使用 LEFT OUTER JOIN.在 BigQuery 中,我想知道如何解决这个问题.我在 BigQuery 中使用标准 SQL.

I am getting this Error: LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join. In BigQuery and was wondering how can I go around this. I am using Standard SQL within BigQuery.

谢谢

推荐答案

以下适用于 BigQuery 标准 SQL,主要重现查询中的逻辑,但不包括根本没有发现任何活动的日子

Below is for BigQuery Standard SQL and mostly reproduce logic in your query with exception of not including days where no activity at all is found

#standardSQL
SELECT
    daily_use.user_id
  , wd.date AS DATE
  , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
FROM dates AS wd
CROSS JOIN daily_use
WHERE wd.date BETWEEN 
  daily_use.activity_date AND DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)
GROUP BY 1,2
-- ORDER BY 1,2

如果出于某种原因您仍然需要完全重现您的逻辑 - 您可以在上面使用最终左连接,如下所示:

if for whatever reason you still need to exactly reproduce your logic - you can embrace above with final left join as below:

#standardSQL
SELECT *
FROM dates AS wd
LEFT JOIN (
  SELECT
    daily_use.user_id
    , wd.date AS date
    , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
  FROM dates AS wd
  CROSS JOIN daily_use
  WHERE wd.date BETWEEN 
    daily_use.activity_date AND DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)
  GROUP BY 1,2
) AS daily_use
USING (date)
-- ORDER BY 1,2

这篇关于SQL - 不等左连接 BigQuery的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆