如何用BigQuery来计算DAU / MAU(参与) [英] How to calculate DAU/MAU with BigQuery (engagement)

查看:282
本文介绍了如何用BigQuery来计算DAU / MAU(参与)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

DAU和MAU(每日活跃用户和每月活跃用户)是衡量用户参与度的既定方式。



如何使用SQL和Google BigQuery获取这些数字?

解决方案

< (要了解DAU / MAU的实用程序,请参阅


DAU and MAU (daily active users and monthly active users) are an established way of measuring user engagement.

How can I get these numbers using SQL and Google BigQuery?

解决方案

(to understand the utility of DAU/MAU see articles like http://blog.compariscope.wefi.com/mobile-app-usage-dau-mau)

Let's play with the reddit comments data stored in BigQuery. We want to find out the dau/mau ratio for the 'AskReddit' subreddit during September on a daily rolling basis:

SELECT day, dau, mau, INTEGER(100*dau/mau) daumau
FROM (
  SELECT day, EXACT_COUNT_DISTINCT(author) dau, FIRST(mau) mau
  FROM (
    SELECT DATE(SEC_TO_TIMESTAMP(created_utc)) day, author
    FROM [fh-bigquery:reddit_comments.2015_09]
    WHERE subreddit='AskReddit') a
  JOIN (
    SELECT stopday, EXACT_COUNT_DISTINCT(author) mau
    FROM (SELECT created_utc, subreddit, author FROM [fh-bigquery:reddit_comments.2015_09], [fh-bigquery:reddit_comments.2015_08]) a
    CROSS JOIN (
      SELECT DATE(SEC_TO_TIMESTAMP(created_utc)) stopday
      FROM [fh-bigquery:reddit_comments.2015_09]
      GROUP BY 1
    ) b
    WHERE subreddit='AskReddit'
    AND SEC_TO_TIMESTAMP(created_utc) BETWEEN DATE_ADD(stopday, -30, 'day') AND TIMESTAMP(stopday)
    GROUP BY 1
  ) b
  ON a.day=b.stopday
  GROUP BY 1
)
ORDER BY 1

This query gets DAU for each day in September, and looks also into August data to get the MAU for each 30 day period ending in each DAU day. That takes a lot of processing (30x), and we can get almost equivalent results if we only calculate one MAU for September, and proceed to use that value as the denominator:

SELECT day, dau, mau, INTEGER(100*dau/mau) daumau
FROM (
  SELECT DATE(SEC_TO_TIMESTAMP(created_utc)) day, EXACT_COUNT_DISTINCT(author) dau, FIRST(mau) mau
  FROM [fh-bigquery:reddit_comments.2015_09] a
  CROSS JOIN (
    SELECT EXACT_COUNT_DISTINCT(author) mau
    FROM [fh-bigquery:reddit_comments.2015_09]
    WHERE subreddit='AskReddit'
  ) b
  WHERE subreddit='AskReddit'
  GROUP BY 1
)
ORDER BY 1

That's a much simpler query that brings us almost equivalent results much faster.

Now to get an average value for this subreddit for the month:

SELECT ROUND(100*AVG(dau/mau), 2) daumau
FROM (
  SELECT DATE(SEC_TO_TIMESTAMP(created_utc)) day, EXACT_COUNT_DISTINCT(author) dau, FIRST(mau) mau
  FROM [fh-bigquery:reddit_comments.2015_09] a
  CROSS JOIN (
    SELECT EXACT_COUNT_DISTINCT(author) mau
    FROM [fh-bigquery:reddit_comments.2015_09]
    WHERE subreddit='AskReddit'
  ) b
  WHERE subreddit='AskReddit'
  GROUP BY 1
)

This tells us that 'AskReddit' had an engagement of 8.95% during September.

Last stop, how to compare engagement within various subreddits:

SELECT ROUND(100*AVG(dau)/MAX(mau), 2) avg_daumau, MAX(mau) mau, subreddit
FROM (
  SELECT a.subreddit, DATE(SEC_TO_TIMESTAMP(created_utc)) day,
         EXACT_COUNT_DISTINCT(author) dau, FIRST(mau) mau
  FROM [fh-bigquery:reddit_comments.2015_09] a
  JOIN (
    SELECT EXACT_COUNT_DISTINCT(author) mau, subreddit
    FROM [fh-bigquery:reddit_comments.2015_09]
    GROUP BY 2
  ) b
  ON a.subreddit=b.subreddit
  WHERE mau>50000
  GROUP BY 1, 2
)

GROUP BY subreddit
ORDER BY 1

这篇关于如何用BigQuery来计算DAU / MAU(参与)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆