Migrate hive table to Google BigQuery


Question


I am trying to design a data pipeline to migrate my Hive tables into BigQuery. Hive is running on a Hadoop on-premises cluster. This is my current design; actually, it is very simple, just a shell script:


for each table source_hive_table {

  • INSERT OVERWRITE TABLE target_avro_hive_table SELECT * FROM source_hive_table;
  • Move the resulting avro files into Google Cloud Storage using distcp
  • Create the first BQ table: bq load --source_format=AVRO your_dataset.something something.avro
  • Handle any casting problem from BigQuery itself, so SELECT from the table just written and handle any casting manually

}
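
For concreteness, here is a minimal sketch of that loop as an actual shell script. Every name in it (the table list, the HDFS staging path, the GCS bucket, the BigQuery dataset) is a hypothetical placeholder, and it assumes the Avro-backed Hive tables already exist and that the cluster has the GCS connector configured for distcp:

```bash
#!/usr/bin/env bash
# Minimal sketch of the pipeline above; all names are hypothetical.
set -euo pipefail

TABLES="orders customers"                       # hypothetical source tables
HDFS_STAGE="/user/hive/warehouse/avro_stage"    # hypothetical Avro staging path
GCS_BUCKET="gs://my-migration-bucket"           # hypothetical bucket
BQ_DATASET="your_dataset"

for t in ${TABLES}; do
  # 1. Re-materialize the source table as Avro; ${t}_avro is assumed to be
  #    a STORED AS AVRO table whose LOCATION is ${HDFS_STAGE}/${t}.
  hive -e "INSERT OVERWRITE TABLE ${t}_avro SELECT * FROM ${t};"

  # 2. Copy the resulting Avro files to GCS with distcp (assumes the
  #    GCS connector is installed on the Hadoop cluster).
  hadoop distcp "${HDFS_STAGE}/${t}/" "${GCS_BUCKET}/${t}/"

  # 3. Load the Avro files into BigQuery; the schema is read from the files.
  bq load --source_format=AVRO "${BQ_DATASET}.${t}" "${GCS_BUCKET}/${t}/*.avro"
done
```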


Do you think it makes sense? Is there any better way, perhaps using Spark? I am not happy with the way I am handling the casting; I would like to avoid creating the BigQuery table twice.

Answer


Yes, your migration logic makes sense.


I personally prefer to do the CAST for specific types directly in the initial "Hive query" that generates your Avro (Hive) data. For instance, the "decimal" type in Hive maps to this Avro logical type: {"type":"bytes","logicalType":"decimal","precision":10,"scale":2}


And BQ will just take the primitive type (here "bytes") instead of the logicalType. That is why I find it easier to cast directly in Hive (here to "double"). The same problem happens with the Hive date type.
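
For illustration, a minimal sketch of what pushing the casts into the extraction query might look like. The table and column names are hypothetical; the decimal-to-DOUBLE cast follows the suggestion above, while casting the date column to STRING is just one possible workaround for the date mapping, not something the answer prescribes:

```bash
# Hypothetical example: cast problem types in the Avro-generating query so
# the files carry plain primitive types that BigQuery loads directly.
hive -e "
INSERT OVERWRITE TABLE orders_avro
SELECT
  order_id,
  CAST(amount     AS DOUBLE) AS amount,      -- decimal(10,2) -> double
  CAST(order_date AS STRING) AS order_date   -- sidestep the date mapping
FROM orders;
"
```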
