Copy a table from one dataset to another in Google BigQuery

Question

I intend to copy a set of tables from one dataset to another within the same project. I run the code in an IPython notebook.

I use the code below to collect the names of the tables to copy in the variable "value":

import re

# `dataset` replaces the original name `list`, which shadowed the built-in
dataset = bq.DataSet('test:TestDataset')

for x in dataset.tables():
    if re.match('table1(.*)', x.name.table_id):
        value = 'test:TestDataset.' + x.name.table_id

Then I tried the "bq cp" command to copy a table from one dataset to another, but I cannot run the bq command in the notebook.

!bq cp $value proj1:test1.table1_20162020

Note:

I checked whether the bigquery module offers an equivalent copy command but could not find one.

Recommended answer

I created the following script to copy all the tables from one dataset to another, with a couple of validation checks.

from google.cloud import bigquery

client = bigquery.Client()

projectFrom = 'source_project_id'
datasetFrom = 'source_dataset'

projectTo = 'destination_project_id'
datasetTo = 'destination_dataset'

# Create dataset references for the source and destination datasets
dataset_from = bigquery.DatasetReference(projectFrom, datasetFrom)
dataset_to = bigquery.DatasetReference(projectTo, datasetTo)

# list_tables was named list_dataset_tables in early releases of the library
for source_table_ref in client.list_tables(dataset_from):
    # Destination table reference
    destination_table_ref = dataset_to.table(source_table_ref.table_id)

    job = client.copy_table(
      source_table_ref,
      destination_table_ref)

    job.result()
    assert job.state == 'DONE'

    dest_table = client.get_table(destination_table_ref)
    source_table = client.get_table(source_table_ref)

    # Validation 1: the destination is not empty (this fails if the source table is empty)
    assert dest_table.num_rows > 0
    # Validation 2: source and destination row counts match
    assert dest_table.num_rows == source_table.num_rows

    print("Source - table: {} row count {}".format(source_table.table_id, source_table.num_rows))
    print("Destination - table: {} row count {}".format(dest_table.table_id, dest_table.num_rows))
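
The script above copies every table, whereas the question only wants tables whose names match 'table1(.*)'. A minimal sketch of that filtering step follows. The helper names plan_copies and copy_matching_tables are hypothetical, and it assumes a recent google-cloud-bigquery release in which list_tables and copy_table accept 'project.dataset' and 'project.dataset.table' strings.

```python
import re


def plan_copies(table_ids, pattern, source_dataset, dest_dataset):
    """Return (source, destination) table ID pairs for matching tables.

    source_dataset and dest_dataset are 'project.dataset' strings.
    """
    regex = re.compile(pattern)
    return [
        ("{}.{}".format(source_dataset, table_id),
         "{}.{}".format(dest_dataset, table_id))
        for table_id in table_ids
        if regex.match(table_id)  # match() anchors at the start of the name
    ]


def copy_matching_tables(client, pattern, source_dataset, dest_dataset):
    """Copy each matching table and wait for its copy job to finish.

    client is expected to behave like google.cloud.bigquery.Client.
    """
    table_ids = [t.table_id for t in client.list_tables(source_dataset)]
    pairs = plan_copies(table_ids, pattern, source_dataset, dest_dataset)
    for source_id, dest_id in pairs:
        client.copy_table(source_id, dest_id).result()  # blocks until the job is done
    return pairs
```

With a real client this could be called as copy_matching_tables(bigquery.Client(), 'table1', 'source_project_id.source_dataset', 'destination_project_id.destination_dataset'). Keeping the name matching in a separate function lets you inspect the planned copies before any job is submitted.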
