Building a basic funnel using Google BigQuery

Question

I noticed that many Google Analytics users also use Google BigQuery, but the documentation is quite limited. Could someone help me build a simple funnel that shows users who visited /pageA, then /pageB, and then /pageC?

I have seen lots of different approaches, and I am not clear on what the "correct" way to do this is.

Recommended answer

You can first concatenate each user's hits with ARRAY_CONCAT_AGG() and then do your calculations on the resulting user-scoped table. The result depends heavily on the time frame you choose, of course.
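
On the time frame point: one way to cover several days is a wildcard table filtered on _TABLE_SUFFIX. The sketch below is only an illustration, and it assumes the public `bigquery-public-data.google_analytics_sample` dataset and an arbitrary one-week range rather than the table used in this answer; swap in your own project, dataset and dates.

```sql
#standardSQL
-- Sketch: widen the time frame by reading several daily export tables at once.
-- Assumption: the sample dataset and the date range are illustrative only.
SELECT
  fullVisitorId,
  -- concatenate hits across every session of the user in the selected range
  ARRAY_CONCAT_AGG(hits ORDER BY visitStartTime ASC) AS userHits
FROM
  `bigquery-public-data.google_analytics_sample.ga_sessions_*`
WHERE
  -- _TABLE_SUFFIX holds the part of the table name matched by the wildcard
  _TABLE_SUFFIX BETWEEN '20170701' AND '20170707'
GROUP BY fullVisitorId
```

A longer range means longer per-user arrays, so the later cross joins have more combinations to check for each user.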

Here, for instance, is the concatenation applied to dummy data from Google:

#standardSQL
WITH arrAgg AS (
  SELECT
    fullvisitorid,
    -- concatenate arrays over multiple sessions
    ARRAY_CONCAT_AGG(hits ORDER BY visitstarttime ASC) userHits
  FROM
    `google.com:analytics-bigquery.LondonCycleHelmet.ga_sessions_20130910`
  GROUP BY 1
)
, journey AS (
  SELECT 
    fullvisitorId,
    -- get a proper running index with combination of unnest and offset of aggregated hits array
    ARRAY( (SELECT AS STRUCT index+1 as hitNumber, page FROM UNNEST(userHits) WITH OFFSET AS index)) as hits
  FROM arrAgg
)

SELECT * FROM journey

When you run this, you can see the new "raw material". In the first step I concatenate the hits; in the second step I build a proper running index for the pages and put everything back into a "hits" array.
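
To see the two steps without touching a real export table, here is a minimal sketch on hand-made data. The `sessions` rows, the user IDs `u1` and `u2`, and the timestamps are all hypothetical; only the column names mirror the Google Analytics export.

```sql
#standardSQL
-- Toy stand-in for ga_sessions: two sessions for u1 (listed out of order), one for u2.
WITH sessions AS (
  SELECT 'u1' AS fullVisitorId, 200 AS visitStartTime,
    [STRUCT('PAGE' AS type, STRUCT('/basket.html' AS pagePath) AS page)] AS hits
  UNION ALL
  SELECT 'u1', 100,
    [STRUCT('PAGE' AS type, STRUCT('/' AS pagePath) AS page),
     STRUCT('PAGE' AS type, STRUCT('/login.html' AS pagePath) AS page)]
  UNION ALL
  SELECT 'u2', 150,
    [STRUCT('PAGE' AS type, STRUCT('/' AS pagePath) AS page)]
)
, arrAgg AS (
  SELECT
    fullVisitorId,
    -- step 1: concatenate the per-session arrays in chronological order
    ARRAY_CONCAT_AGG(hits ORDER BY visitStartTime ASC) AS userHits
  FROM sessions
  GROUP BY fullVisitorId
)
-- step 2: a running index over the concatenated array, flattened for inspection
SELECT
  fullVisitorId,
  h.page.pagePath,
  index + 1 AS hitNumber
FROM arrAgg, UNNEST(userHits) AS h WITH OFFSET AS index
ORDER BY fullVisitorId, hitNumber
```

For u1 the rows should come back as /, /login.html, /basket.html with hitNumber 1 to 3, even though the later session is listed first, because the concatenation is ordered by visitStartTime.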

You can then build the user journey by cross joining the hits array with itself and comparing the pages and their order in the sequence:

#standardSQL
WITH arrAgg AS (
  SELECT
    fullvisitorid,
    SUM(totals.visits) sessions,
    -- concatenate arrays over multiple sessions
    ARRAY_CONCAT_AGG(hits ORDER BY visitstarttime ASC) userHits
  FROM
    `google.com:analytics-bigquery.LondonCycleHelmet.ga_sessions_20130910`
  GROUP BY 1
)
, journey AS (
  SELECT 
    fullvisitorId,
    sessions,
    -- get a proper running index with combination of unnest and offset of aggregated hits array
    ARRAY( (SELECT AS STRUCT index+1 as hitNumber, page FROM UNNEST(userHits) WITH OFFSET AS index WHERE type='PAGE')) as hits
  FROM arrAgg
)
-- funnel: homepage: /, login: /login.html, basket: /basket.html, confirm: /confirm.html
SELECT 
  SUM(sessions) allSessions,
  COUNT(1) allUsers,
  -- check if any page was home page
  SUM( (SELECT IF( LOGICAL_OR(page.pagePath='/'), 1, 0) FROM j.hits) ) step1_home,
  -- cross join the hits array with itself to get every pair of pages; count the user once if any pair is the home page followed later by the login page
  SUM( (SELECT IF( LOGICAL_OR(a.page.pagePath='/' AND b.page.pagePath='/login.html' AND a.hitNumber < b.hitNumber) ,1, 0 ) FROM j.hits a CROSS JOIN j.hits b) ) step2_login,
  -- extend cross join principle to a third page
  SUM( (SELECT IF( LOGICAL_OR(
      a.page.pagePath='/' AND b.page.pagePath='/login.html' AND c.page.pagePath='/basket.html' AND
      a.hitNumber < b.hitNumber AND b.hitNumber < c.hitNumber 
      ) ,1, 0 ) FROM j.hits a CROSS JOIN j.hits b CROSS JOIN j.hits c) ) step3_basket,
  -- extend cross join principle to a fourth page
  SUM( (SELECT IF( LOGICAL_OR(
      a.page.pagePath='/' AND b.page.pagePath='/login.html' AND c.page.pagePath='/basket.html' AND d.page.pagePath='/confirm.html' AND
      a.hitNumber < b.hitNumber AND b.hitNumber < c.hitNumber AND c.hitNumber < d.hitNumber
      ) ,1, 0 ) FROM j.hits a CROSS JOIN j.hits b CROSS JOIN j.hits c CROSS JOIN j.hits d) ) step4_confirm
FROM journey j

Since everything operates with subqueries on arrays, it should scale quite well thanks to parallelization. Please test it before using it - I didn't ;) But it should point in the right direction.
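
One way to do that testing is to run the same cross-join pattern against a few hand-made journeys whose funnel outcome is known in advance. This is a sketch only: the `journey` rows and the users `u1` to `u3` are hypothetical data, and only the first three funnel steps are checked.

```sql
#standardSQL
-- Hand-made journeys (hypothetical data) with known funnel outcomes.
WITH journey AS (
  -- u1 completes home -> login -> basket in order
  SELECT 'u1' AS fullVisitorId, [
    STRUCT(1 AS hitNumber, STRUCT('/' AS pagePath) AS page),
    STRUCT(2 AS hitNumber, STRUCT('/login.html' AS pagePath) AS page),
    STRUCT(3 AS hitNumber, STRUCT('/basket.html' AS pagePath) AS page)
  ] AS hits
  UNION ALL
  -- u2 sees login before home, so only step 1 should count
  SELECT 'u2', [
    STRUCT(1 AS hitNumber, STRUCT('/login.html' AS pagePath) AS page),
    STRUCT(2 AS hitNumber, STRUCT('/' AS pagePath) AS page)
  ]
  UNION ALL
  -- u3 never visits the home page
  SELECT 'u3', [
    STRUCT(1 AS hitNumber, STRUCT('/basket.html' AS pagePath) AS page)
  ]
)
SELECT
  COUNT(1) AS allUsers,
  -- step 1: any hit on the home page
  SUM((SELECT IF(LOGICAL_OR(page.pagePath = '/'), 1, 0) FROM j.hits)) AS step1_home,
  -- step 2: home page followed later by login
  SUM((SELECT IF(LOGICAL_OR(
      a.page.pagePath = '/' AND b.page.pagePath = '/login.html'
      AND a.hitNumber < b.hitNumber), 1, 0)
    FROM j.hits a CROSS JOIN j.hits b)) AS step2_login,
  -- step 3: home, then login, then basket
  SUM((SELECT IF(LOGICAL_OR(
      a.page.pagePath = '/' AND b.page.pagePath = '/login.html'
      AND c.page.pagePath = '/basket.html'
      AND a.hitNumber < b.hitNumber AND b.hitNumber < c.hitNumber), 1, 0)
    FROM j.hits a CROSS JOIN j.hits b CROSS JOIN j.hits c)) AS step3_basket
FROM journey j
```

With these three journeys the query should return allUsers = 3, step1_home = 2, step2_login = 1 and step3_basket = 1. If the ordering conditions were dropped, u2 would wrongly count toward step 2, which makes this a quick check that the hitNumber comparisons are doing their job.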
