在BigQuery上重新创建GA漏斗 [英] Recreate GA Funnel on BigQuery

查看:177
本文介绍了在BigQuery上重新创建GA漏斗的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用BigQuery重新创建GA漏斗(Google360上的自定义报告)。 GA上的漏斗使用每页上发生的事件的唯一计数。我在网上发现了大部分工作的查询:

  SELECT 
COUNT(s0.firstHit)AS Landing_Page ,
COUNT(s1.firstHit)AS Model_Selection
from(
SELECT
s0.fullvisitorID,
s0.firstHit,
s1.firstHit,
FROM(
#开始子查询#1 aka s0
SELECT
fullvisitorID,
MIN(hits.hitNumber)AS firstHit
FROm [64269470.ga_sessions_20170720] $ b (''landing_page')$ b $ AND WHERE
WHITS
hits.eventInfo.eventAction AND totals.visits = 1
GROUP BY
fullvisitorID
)s0
# End Subquery#1 aka s0

left join(

#开始子查询#2又名s1
SELECT
fullvisitorID,
MIN(点击次数.hitNumber)AS firstHit
FROM [64269470.ga_ ('model_selection_page')
AND totals.visits = 1
GROUP BY
fullvisitorID,
)s1中的
WHERE
hits.eventInfo.eventAction

ON
s0.fullvisitorID = s1.fullvisitorID


查询工作正常,着陆页的值与GA上的相同,但Model_Selection高出约10%。这个差异也随着漏斗的增加而增加(为了清楚起见,我只发布了两个步骤)。
任何想法我在这里想念什么?

解决方案

此查询可以满足您的需求,但只需标准SQL 版本:

  #standardSQL 
SELECT
SUM((SELECT COUNTIF(eventInfo.eventAction ='landing_page')FROM UNNEST(hits)))Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction ='model_selection_page') FROM UNNEST(hits)WHERE EXISTS(SELECT 1 FROM UNNEST(hits)WHERE eventInfo.eventAction ='landing_page')))Model_Selection
FROM`64269470.ga_sessions_20170720`

就是这样。 4线,方式更快,更便宜。



您也可以玩模拟数据,如:

  #standardSQL 
WITH data AS(
SELECT'1'as fullvisitorid,ARRAY< STRUCT< eventInfo STRUCT< eventAction STRING>>>> [STRUCT(STRUCT(' (STRING)(STRUCT(STRUCT('landing_page'AS eventAction)AS eventInfo)] AS hits UNION ALL
SELECT'1'AS fullvisitorid,ARRAY< STRUCT< eventInfo STRUCT< eventAction STRING>>> AS STRING< STRUCT< eventInfo STRUCT< eventAction STRING>>> [STRUCT('landing_page'AS eventAction)AS eventInfo)] AS hit CHUNK ALL ALL
SELECT'1' (STRUCT('landing_page'AS eventAction)AS eventInfo),STRUCT(STRUCT('model_selection_page'AS eventAction)AS eventInfo)] AS hit UNION ALL
SELECT'1'AS fullvisitorid,ARRAY< STRUCT< eventInfo STRUCT< eventAction STRING>>> [S TRUCT(STRUCT('model_selection_page'AS eventAction)AS eventInfo),STRUCT(STRUCT('model_selection_page'AS eventAction)AS eventInfo)] AS hits


SELECT
SUM (SELECT COUNTIF(eventInfo.eventAction ='landing_page')FROM UNNEST(hits)))Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction ='model_selection_page')FROM UNNEST(hits)WHERE EXISTS(SELECT 1 FROM UNNEST(hits)WHERE eventInfo.eventAction ='landing_page')))Model_Selection
FROM data

请注意,在GA中构建这种类型的报告可能会更困难一些,因为您需要选择至少在事件'landing_page'触发后触发事件'model_selection_page'的访问者。确保你的GA中正确地建立了这个报告(一种方法可能是先建立一个自定义报告,只有'landing_page'被激活的客户,然后应用第二个过滤器寻找'model_selection_page')。




您在评论中提出了有关在会话和用户级别进行计数的问题。为了计算每个会话,您可以将结果限制为1,如下所示:

  SELECT 
SUM((SELECT 1 FROM UNNEST(hits)WHERE eventInfo.eventAction ='landing_page'LIMIT 1))Landing_Page,
SUM((SELECT 1 FROM UNNEST(hits)WHERE EXISTS(SELECT 1 FROM UNNEST(hits)WHERE Model_Selection
FROM data


$ eventInfo.eventAction ='landing_page')AND eventInfo.eventAction ='model_selection_page'LIMIT 1) b $ b

为了计算不同的用户,这个想法是相同的,但是您必须应用 COUNT(DISTINCT)操作,如下所示:

  SELECT 
COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits)WHERE eventInfo.eventAction ='landing_page'LIMIT 1))Landing_Page,
COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits)WHERE EXISTS(SELECT 1 FROM UNNEST(hits)WHERE eventInfo.eventAction ='landing_page')AND eventInfo.eventAction ='model_selection_页'LIMIT 1))Model_Selection
FROM data


I am trying to recreate the GA funnel (custom report on Google360) using BigQuery. The funnel on GA is using the unique count of events that happen on each page. I found this query online that is working for the most part:

SELECT
  COUNT( s0.firstHit) AS Landing_Page,
  COUNT( s1.firstHit) AS Model_Selection
from(
SELECT
      s0.fullvisitorID,
      s0.firstHit,
      s1.firstHit,
    FROM (
            # Begin Subquery #1 aka s0
            SELECT
                    fullvisitorID,
                    MIN(hits.hitNumber) AS firstHit
            FROm [64269470.ga_sessions_20170720]
            WHERE
                  hits.eventInfo.eventAction  in ('landing_page') 
                    AND totals.visits = 1
            GROUP BY
                  fullvisitorID
                  ) s0
    # End Subquery #1 aka s0

    left join (

    # Begin Subquery #2 aka s1
          SELECT
              fullvisitorID,
              MIN(hits.hitNumber) AS firstHit
          FROM [64269470.ga_sessions_20170720]
          WHERE
            hits.eventInfo.eventAction  in ('model_selection_page')
            AND totals.visits = 1
          GROUP BY
                fullvisitorID,
                ) s1

      ON
    s0.fullvisitorID = s1.fullvisitorID

    )

The query works fine and the value for landing page is the same as I can get on GA, but Model_Selection is about 10% higher. This difference also increases along the funnel (I only posted 2 steps for clarity). Any idea what am I missing here?

解决方案

This query does what you need but in Standard SQL Version:

#standardSQL
SELECT 
  SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
  SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection  
FROM `64269470.ga_sessions_20170720`

Just that. 4 lines, way faster and cheaper.

You can also play with simulated data, something like:

#standardSQL
WITH data AS(
  SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
  SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
  SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
  SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits
)

SELECT 
  SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
  SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection  
FROM data

Notice that building this type of report in GA might be a bit more difficult as you need to select visitors who had at least fired once the event 'landing_page' and then had the event 'model_selection_page' fired. Make sure you got this report built correctly as well in your GA (one way might be to first build a customized report with only customers who had 'landing_page' fired and then apply the second filter looking for 'model_selection_page').

[EDIT]:

You asked in your comment about bringing this counting on the session and user level. For counting each session, you can limit the results to 1 for each sub-query evaluation, like so:

SELECT 
  SUM((SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
  SUM((SELECT 1 FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection  
FROM data

For counting distinct users, the idea is the same but you'd have to apply a COUNT(DISTINCT) operation, like so:

SELECT 
  COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
  COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection  
FROM data

这篇关于在BigQuery上重新创建GA漏斗的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆