When should I run daily ETL jobs for Firebase Analytics data exported to BigQuery?


Question

We use Firebase Analytics to collect events from our apps, and we have enabled event export to BigQuery. Every day we run some ETL jobs to create friendlier analytics tables in BigQuery (e.g. sessions, purchases).

The question is: when should we run these ETL jobs?

We know that Firebase Analytics creates an 'events_intraday_' table in BigQuery, which changes to 'events_' some hours after midnight. We also understand that some events may be reported later if the client is not connected to the internet, but that is not the problem here.

Our theory is that the 'events_intraday_' table is some kind of temporary table, and that we should run the ETL jobs once it changes to 'events_'. Unfortunately, we could not find any documentation about this. Is this a good solution?

Recommended answer

Thanks to Frank van Puffelen, I found the Firebase Blog article "How Long Does it Take for My Firebase Analytics Data to Show Up?", which says that analytics data exported to BigQuery can be delayed by up to a little more than 1 hour. Based on this information, the ETL jobs should run at around, say, 2 AM UTC+0, and the query should simply UNION ALL the events tables with the events_intraday table.

So if today is 2019-04-02 and I want to query data from last month, the query should look like this:

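-- Daily events_ tables from 2019-03-01 through 2019-04-01
SELECT * FROM
(
  SELECT *
  FROM `<PROJECT_ID>.analytics_<ANALYTICS_ID>.events_*`
  WHERE _TABLE_SUFFIX BETWEEN '20190301' AND '20190401'
)
UNION ALL
(
  -- Yesterday's intraday table, in case it has not changed to events_ yet
  SELECT *
  FROM `<PROJECT_ID>.analytics_<ANALYTICS_ID>.events_intraday_*`
  WHERE _TABLE_SUFFIX = '20190401'
)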
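
Since a daily job runs on a different date each time, the hard-coded suffixes above would need to be recomputed on every run. The following is a minimal sketch of one way to do that in Python, not part of the original answer. The function name `build_events_query` and its parameters are hypothetical, and the sketch assumes the job runs after midnight UTC so that "yesterday" is the most recent complete day. It only renders the SQL text; submitting it to BigQuery is left out.

```python
from datetime import date, timedelta


def build_events_query(run_date: date, first_day: date, dataset: str) -> str:
    """Render the UNION ALL query for first_day through the day before run_date.

    `dataset` is the fully qualified dataset, e.g.
    "<PROJECT_ID>.analytics_<ANALYTICS_ID>" (hypothetical parameter).
    """
    yesterday = run_date - timedelta(days=1)
    if first_day > yesterday:
        raise ValueError("first_day must not be later than the day before run_date")

    # Table suffixes use the YYYYMMDD format seen in the query above.
    start = first_day.strftime("%Y%m%d")
    end = yesterday.strftime("%Y%m%d")

    return (
        "SELECT * FROM\n"
        "(\n"
        "  SELECT *\n"
        f"  FROM `{dataset}.events_*`\n"
        f"  WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'\n"
        ")\n"
        "UNION ALL\n"
        "(\n"
        "  SELECT *\n"
        f"  FROM `{dataset}.events_intraday_*`\n"
        f"  WHERE _TABLE_SUFFIX = '{end}'\n"
        ")"
    )


if __name__ == "__main__":
    # Reproduces the example above: today is 2019-04-02, data from 2019-03-01 on.
    print(
        build_events_query(
            run_date=date(2019, 4, 2),
            first_day=date(2019, 3, 1),
            dataset="<PROJECT_ID>.analytics_<ANALYTICS_ID>",
        )
    )
```

Passing the run date in explicitly, rather than reading the clock inside the function, makes it possible to re-run the job for a past day and get the same query.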
