SQL - Unequal left join BigQuery


Problem description

New here. I am trying to get the daily and weekly active users over time. Users have 30 days before they are considered inactive. My goal is to create graphs that can be split by user_id to show cohorts, regions, categories, etc.

I have created a date table to get every day in the time period, and I have a simplified orders table with the base info that I need to calculate this.

I am trying to do a left join to get the status by date using the following SQL query:

WITH daily_use AS (
        SELECT
          __key__.id AS user_id
          , DATE_TRUNC(date(placeOrderDate), day) AS activity_date
        FROM `analysis.Order`
        where isBuyingGroupOrder = TRUE 
          AND testOrder = FALSE
        GROUP BY 1, 2
 ),
dates AS (
        SELECT DATE_ADD(DATE "2016-01-01", INTERVAL d.d DAY) AS date
        FROM
          (
           SELECT ROW_NUMBER() OVER(ORDER BY __key__.id) -1 AS d
           FROM `analysis.Order`
           ORDER BY __key__.id
           LIMIT 1096
          ) AS  d
        ORDER BY 1 DESC
      )

SELECT
      daily_use.user_id
    , wd.date AS date
    , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
FROM dates AS wd

LEFT JOIN daily_use
    ON wd.date >= daily_use.activity_date
    AND wd.date < DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)

GROUP BY 1,2

I am getting this error from BigQuery: "LEFT OUTER JOIN cannot be used without a condition that is an equality of fields from both sides of the join." How can I get around this? I am using Standard SQL within BigQuery.

Thanks

Recommended answer

The query below is for BigQuery Standard SQL. It mostly reproduces the logic of your query, except that it does not include days on which no activity at all is found.

#standardSQL
SELECT
    daily_use.user_id
  , wd.date AS date
  , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
FROM dates AS wd
CROSS JOIN daily_use
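WHERE wd.date >= daily_use.activity_date
  -- half-open upper bound, as in the original join condition (BETWEEN would also include day 30)
  AND wd.date < DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)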
GROUP BY 1,2
-- ORDER BY 1,2
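
As a rough, self-contained sketch of the same pattern, the query below inlines both CTEs so it can be run on its own. The three rows in daily_use are made-up sample data (not from the question), and GENERATE_DATE_ARRAY is used here as an assumed replacement for the ROW_NUMBER / LIMIT 1096 trick; it covers the same 1096 days starting at 2016-01-01. The range condition sits in WHERE after a CROSS JOIN, which avoids the equality requirement that LEFT OUTER JOIN raised in the question.

```sql
#standardSQL
-- Sketch only: the sample rows in daily_use are hypothetical.
WITH daily_use AS (
  SELECT 'u1' AS user_id, DATE '2016-01-01' AS activity_date UNION ALL
  SELECT 'u1', DATE '2016-01-20' UNION ALL
  SELECT 'u2', DATE '2016-02-10'
),
dates AS (
  -- One row per calendar day from 2016-01-01 through 2018-12-31 (1096 days).
  SELECT date
  FROM UNNEST(
    GENERATE_DATE_ARRAY(DATE '2016-01-01', DATE '2018-12-31', INTERVAL 1 DAY)
  ) AS date
)
SELECT
    daily_use.user_id
  , wd.date AS date
  -- Smallest gap to any activity in the window = days since the latest one.
  , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
FROM dates AS wd
CROSS JOIN daily_use
-- Range predicate moved from the join condition into WHERE.
WHERE wd.date >= daily_use.activity_date
  AND wd.date < DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)
GROUP BY 1, 2
ORDER BY 1, 2
```

Note that a CROSS JOIN pairs every date with every activity row before filtering, so on a large orders table it may be worth narrowing the date range or the activity rows first.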

If for whatever reason you still need to reproduce your logic exactly, you can wrap the query above in a final left join, as below:

#standardSQL
SELECT *
FROM dates AS wd
LEFT JOIN (
  SELECT
    daily_use.user_id
    , wd.date AS date
    , MIN(DATE_DIFF(wd.date, daily_use.activity_date, DAY)) AS days_since_last_action
  FROM dates AS wd
  CROSS JOIN daily_use
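  WHERE wd.date >= daily_use.activity_date
    -- half-open upper bound, as in the original join condition (BETWEEN would also include day 30)
    AND wd.date < DATE_ADD(daily_use.activity_date, INTERVAL 30 DAY)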
  GROUP BY 1,2
) AS daily_use
USING (date)
-- ORDER BY 1,2
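
To get from the per-user rows to the daily and weekly active user counts mentioned in the question, one possible follow-up is sketched below. It assumes the result of the query above is available under the hypothetical name user_days (inlined here with made-up sample rows), and it assumes "daily active" means an action on that same day and "weekly active" means an action within the previous 7 days; neither definition is given in the question, so adjust the thresholds as needed.

```sql
#standardSQL
-- Sketch only: user_days stands in for the output of the query above;
-- its name and the sample rows are hypothetical.
WITH user_days AS (
  SELECT 'u1' AS user_id, DATE '2016-01-01' AS date, 0 AS days_since_last_action UNION ALL
  SELECT 'u1', DATE '2016-01-02', 1 UNION ALL
  SELECT 'u2', DATE '2016-01-02', 0 UNION ALL
  SELECT 'u2', DATE '2016-01-09', 7
)
SELECT
    date
  -- Assumed definition: acted on this exact day.
  , COUNT(DISTINCT IF(days_since_last_action = 0, user_id, NULL)) AS daily_active_users
  -- Assumed definition: acted within the last 7 days (gaps 0 through 6).
  , COUNT(DISTINCT IF(days_since_last_action < 7, user_id, NULL)) AS weekly_active_users
  -- Every row already falls inside the 30-day window.
  , COUNT(DISTINCT user_id) AS active_within_30_days
FROM user_days
GROUP BY date
ORDER BY date
```

Keeping user_id in the intermediate result, rather than aggregating straight to counts, leaves room to join cohort, region, or category attributes before grouping.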
