Firebase data to Google BigQuery

Question

Firebase offers private backups on Google Cloud Storage. One of the featured use cases is "Ingestion into Analytics Products":

Private Backups provides a perfect pipeline into cloud analytics products such as Google’s BigQuery. Cloud Analytics products often prefer to ingest data through Cloud Storage buckets rather than directly from the application.

I have a lot of data in Firebase (more than 1 GB when exported to a Cloud Storage bucket) and, as described in the Firebase offering, I wanted to load that data into BigQuery.

But is it really possible to write a table schema that fits Firebase raw data? Let's take as an example the dinosaur-facts database from Firebase documentation. The JSON looks like this:

{
  "dinosaurs" : {
    "bruhathkayosaurus" : {
      "appeared" : -70000000,
      "height" : 25
    },
    "lambeosaurus" : {
      "appeared" : -76000000,
      "height" : 2.1
    }
  },
  "scores" : {
    "bruhathkayosaurus" : 55,
    "lambeosaurus" : 21
  }
}

To list all dinosaurs, I suppose the only way would be to use a RECORD field in the BigQuery schema. But a repeated RECORD in BigQuery usually corresponds to an array in the imported JSON, and there is no array here in Firebase, just an object with dinosaur names as the keys.

So a BigQuery table schema like this doesn't work:

[
    {
        "name": "dinosaurs",
        "type": "RECORD",
        "mode": "REQUIRED",
        "fields": [
            {
                "name": "dinosaur",
                "type": "RECORD",
                "mode": "REPEATED",
                "fields": [
                    {
                        "name": "appeared",
                        "type": "INTEGER"
                    },
                    {
                        "name": "height",
                        "type": "INTEGER"
                    },
                    {
                        "name": "length",
                        "type": "INTEGER"
                    },
                    {
                        "name": "order",
                        "type": "STRING"
                    },
                    {
                        "name": "vanished",
                        "type": "INTEGER"
                    },
                    {
                        "name": "weight",
                        "type": "INTEGER"
                    }
                ]
            },
            {
                "name": "scores",
                "type": "RECORD",
                "mode": "REPEATED",
                "fields": [
                    {
                        "name": "dinosaur",
                        "type": "INTEGER"
                    }
                ]
            }
        ]
    }
]

Is it possible to write a table schema that fits Firebase raw data? Or should we first prepare the data to make it compatible with BigQuery?

Solution

Since the data above is just JSON, you should be able to get it to work with Firebase. However, I think that it would be much easier to prepare the data after the backup.
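As a rough illustration of what "preparing the data after the backup" could look like, here is a minimal Python sketch. It assumes the backup is the JSON shown in the question and turns each keyed object into a list of rows, moving the key (the dinosaur name) into a column. The function names and the `name` and `value` column names are hypothetical choices for this example, not part of Firebase or BigQuery.

```python
import json

def keyed_object_to_rows(node, key_field="name"):
    """Turn {"key": {...}} into [{"name": "key", ...}] rows."""
    rows = []
    for key, value in node.items():
        if isinstance(value, dict):
            # Object children: the key becomes one more column on the row.
            row = {key_field: key, **value}
        else:
            # Scalar children (as in "scores") go into a "value" column.
            row = {key_field: key, "value": value}
        rows.append(row)
    return rows

def to_ndjson(rows):
    """Serialize rows as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)

# Sample data copied from the question.
export = {
    "dinosaurs": {
        "bruhathkayosaurus": {"appeared": -70000000, "height": 25},
        "lambeosaurus": {"appeared": -76000000, "height": 2.1},
    },
    "scores": {"bruhathkayosaurus": 55, "lambeosaurus": 21},
}

dinosaur_rows = keyed_object_to_rows(export["dinosaurs"])
score_rows = keyed_object_to_rows(export["scores"])

print(to_ndjson(dinosaur_rows))
print(to_ndjson(score_rows))
```

Each resulting line is a self-contained JSON object, which is the shape BigQuery expects when loading newline-delimited JSON. The two lists would most naturally become two flat tables that can be joined on `name`.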

You mentioned that there were no arrays in the Firebase data. Firebase does support arrays, but they have to meet certain criteria.

// we send this
['a', 'b', 'c', 'd', 'e']
// Firebase stores this
{0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e'}
// since the keys are numeric and sequential,
// if we query the data, we get this
['a', 'b', 'c', 'd', 'e']

Even though it may look like an object in the Firebase database, it will come back as an array when queried.
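The same rule can be applied offline to a backup. The Python sketch below covers only the strict case described above, where the keys are exactly 0 through n-1. The function name `as_firebase_array` is made up for this example, and how the real service treats keys with gaps is not modeled here.

```python
def as_firebase_array(node):
    """Return a list if the dict keys are exactly 0..n-1, else the node unchanged."""
    if not isinstance(node, dict) or not node:
        return node
    try:
        indices = sorted(int(key) for key in node)
    except (TypeError, ValueError):
        # At least one key is not numeric, so this stays an object.
        return node
    if indices != list(range(len(node))):
        # Numeric keys, but not sequential from zero.
        return node
    # Emit the values in index order.
    return [node[key] for key in sorted(node, key=int)]

stored = {"0": "a", "1": "b", "2": "c", "3": "d", "4": "e"}
print(as_firebase_array(stored))

dinosaurs = {"bruhathkayosaurus": {"height": 25}, "lambeosaurus": {"height": 2.1}}
print(as_firebase_array(dinosaurs))
```

With this rule, the `dinosaurs` object from the question stays an object because its keys are names rather than indices, which is why it does not map onto a REPEATED field as it stands.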

So it is feasible to create your schema in your Firebase database, but it would likely create a lot of overhead for your application.
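If the data is reshaped after the backup instead, a flat schema can be derived from the reshaped rows. The following Python sketch assumes rows of the form `{"name": ..., "appeared": ..., "height": ...}`, with the dinosaur name moved into a hypothetical `name` column. The helper `infer_flat_schema` is an illustrative name, not a BigQuery API. It also shows that `height` needs a FLOAT column, since the sample data contains `2.1`.

```python
import json

def infer_flat_schema(rows):
    """Infer a flat BigQuery-style schema (a list of field dicts) from row dicts."""
    types = {}
    for row in rows:
        for field, value in row.items():
            if value is None:
                continue  # nulls carry no type information
            if isinstance(value, bool):
                kind = "BOOLEAN"
            elif isinstance(value, int):
                kind = "INTEGER"
            elif isinstance(value, float):
                kind = "FLOAT"
            else:
                kind = "STRING"
            previous = types.get(field)
            if previous is None or previous == kind:
                types[field] = kind
            elif {previous, kind} == {"INTEGER", "FLOAT"}:
                types[field] = "FLOAT"  # widen mixed numbers to FLOAT
            else:
                types[field] = "STRING"  # fall back for conflicting types
    return [
        {"name": name, "type": kind, "mode": "NULLABLE"}
        for name, kind in types.items()
    ]

# Rows as they might look after moving each dinosaur name into a column.
rows = [
    {"name": "bruhathkayosaurus", "appeared": -70000000, "height": 25},
    {"name": "lambeosaurus", "appeared": -76000000, "height": 2.1},
]

schema = infer_flat_schema(rows)
print(json.dumps(schema, indent=2))
```

The printed list uses the same `name`/`type`/`mode` layout as the schema in the question, but without nested or repeated records.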
