Firebase data to Google BigQuery


Question

Firebase offers private backups on Google Cloud Storage. One of the featured use cases is "Ingestion into Analytics Products":

Private Backups provides a perfect pipeline into cloud analytics products such as Google’s BigQuery. Cloud Analytics products often prefer to ingest data through Cloud Storage buckets rather than directly from the application.

I have a lot of data in Firebase (more than 1 GB when exported to a Cloud Storage bucket) and, as described in the Firebase offering, I want to load that data into BigQuery.

But is it really possible to write a table schema that fits Firebase raw data? Let's take as an example the dinosaur-facts database from the Firebase documentation. The JSON looks like this:

{
  "dinosaurs" : {
    "bruhathkayosaurus" : {
      "appeared" : -70000000,
      "height" : 25
    },
    "lambeosaurus" : {
      "appeared" : -76000000,
      "height" : 2.1
    }
  },
  "scores" : {
    "bruhathkayosaurus" : 55,
    "lambeosaurus" : 21
  }
}

To list all dinosaurs, I suppose the only way would be to use a RECORD field in the BigQuery schema. But a repeated RECORD in BigQuery usually corresponds to an array in the imported JSON, and there is no array here in Firebase, just an object with the dinosaur names as keys.

So a BigQuery table schema like this doesn't work:

[
    {
        "name": "dinosaurs",
        "type": "RECORD",
        "mode": "REQUIRED",
        "fields": [
            {
                "name": "dinosaur",
                "type": "RECORD",
                "mode": "REPEATED",
                "fields": [
                    {
                        "name": "appeared",
                        "type": "INTEGER"
                    },
                    {
                        "name": "height",
                        "type": "INTEGER"
                    },
                    {
                        "name": "length",
                        "type": "INTEGER"
                    },
                    {
                        "name": "order",
                        "type": "STRING"
                    },
                    {
                        "name": "vanished",
                        "type": "INTEGER"
                    },
                    {
                        "name": "weight",
                        "type": "INTEGER"
                    }
                ]
            },
            {
                "name": "scores",
                "type": "RECORD",
                "mode": "REPEATED",
                "fields": [
                    {
                        "name": "dinosaur",
                        "type": "INTEGER"
                    }
                ]
            }
        ]
    }
]

Is it possible to write a table schema that fits Firebase raw data? Or should we first prepare the data to make it compatible with BigQuery?
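If the data is prepared first, one option is to turn each name-keyed object into a flat row and write the rows as newline-delimited JSON, a format BigQuery can load. The following is only a minimal sketch under assumptions: the function names `flatten_dinosaurs` and `to_ndjson`, and the `name` and `score` columns, are made up for illustration and are not part of Firebase or BigQuery.

```python
import json


def flatten_dinosaurs(export):
    """Turn Firebase's name-keyed objects into one flat row per dinosaur."""
    scores = export.get("scores", {})
    rows = []
    for name, facts in sorted(export.get("dinosaurs", {}).items()):
        row = {"name": name}   # the Firebase key becomes an ordinary column
        row.update(facts)      # appeared, height, and any other facts
        if name in scores:     # leave the column out when there is no score
            row["score"] = scores[name]
        rows.append(row)
    return rows


def to_ndjson(rows):
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


# The sample export from the question.
export = {
    "dinosaurs": {
        "bruhathkayosaurus": {"appeared": -70000000, "height": 25},
        "lambeosaurus": {"appeared": -76000000, "height": 2.1},
    },
    "scores": {"bruhathkayosaurus": 55, "lambeosaurus": 21},
}

rows = flatten_dinosaurs(export)
ndjson = to_ndjson(rows)
print(ndjson)
```

With rows shaped like this, the dinosaur name is data rather than a field name, so the table schema no longer has to know every key in advance.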

Solution

As of this writing (03/2017), I can confirm that there is no real integration between the Firebase Realtime Database and BigQuery. Only Firebase Analytics can be imported easily into BigQuery. None of this is clearly explained by Firebase either...

We ended up writing our own solution, but you can check out this GitHub repo, which has 400+ stars, so I assume a few people have found it useful...
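For illustration only, here is what the schema side of a home-grown solution could look like once each dinosaur is stored as one flat row. This is a sketch, not the solution mentioned above: `FLAT_SCHEMA`, `check_row`, and the `name` and `score` columns are hypothetical. Note that `height` is declared as FLOAT here, an assumption made because the sample data contains 2.1, which would not fit an INTEGER column.

```python
import json

# Hypothetical flat schema: one row per dinosaur, with the Firebase key
# stored in a "name" column instead of being used as a field name.
FLAT_SCHEMA = [
    {"name": "name", "type": "STRING", "mode": "REQUIRED"},
    {"name": "appeared", "type": "INTEGER", "mode": "NULLABLE"},
    {"name": "height", "type": "FLOAT", "mode": "NULLABLE"},
    {"name": "score", "type": "INTEGER", "mode": "NULLABLE"},
]

# Python types accepted for each schema type in this local check.
PYTHON_TYPES = {"STRING": (str,), "INTEGER": (int,), "FLOAT": (int, float)}


def check_row(row, schema=FLAT_SCHEMA):
    """Return a list of problems found in row; an empty list means it fits."""
    problems = []
    known = {field["name"] for field in schema}
    for key in row:
        if key not in known:
            problems.append(f"unknown field: {key}")
    for field in schema:
        value = row.get(field["name"])
        if value is None:
            if field["mode"] == "REQUIRED":
                problems.append(f"missing required field: {field['name']}")
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, PYTHON_TYPES[field["type"]]):
            problems.append(f"wrong type for {field['name']}: expected {field['type']}")
    return problems


# The schema serialized as JSON, the same shape as the schema in the question.
schema_json = json.dumps(FLAT_SCHEMA, indent=2)

print(check_row({"name": "lambeosaurus", "appeared": -76000000, "height": 2.1, "score": 21}))
```

Checking rows locally before loading is a design choice rather than a requirement; it simply surfaces type mismatches, such as a fractional height, before a load job would reject them.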
