Datalab does not populate bigQuery tables


Problem description




Hi, I have a problem while using IPython notebooks on Datalab.

I want to write the result of a query into a BigQuery table, but it does not work. People suggest using the insert_data(dataframe) function, but it does not populate my table. To simplify the problem, I tried to read a table and write the data into a just-created table (with the same schema), but that does not work either. Can anyone tell me where I am going wrong?

import gcp
import gcp.bigquery as bq

# Read some data into a dataframe.
df = bq.Query('SELECT 1 as a, 2 as b FROM [publicdata:samples.wikipedia] LIMIT 3').to_dataframe()

# Create a dataset and extract the schema from the dataframe.
dataset = bq.DataSet('prova1')
dataset.create(friendly_name='aaa', description='bbb')
schema = bq.Schema.from_dataframe(df)

# Create the table.
temptable = bq.Table('prova1.prova2').create(schema=schema, overwrite=True)

# Try to put the same data into the just-created temptable.
temptable.insert_data(df)

Solution

Calling insert_data will do an HTTP POST and return once that is done. However, it can take some time (up to several minutes) for the data to show up in the BQ table. Try waiting a while before using the table. We may be able to address this in a future update; see this.

The hacky way to block until ready right now should be something like:

import time
while True:
  # Fetch the table's metadata via the (private) API helpers.
  info = temptable._api.tables_get(temptable._name_parts)
  # No streaming buffer means all rows have been committed.
  if 'streamingBuffer' not in info:
    break
  # Rows counted in the streaming buffer are already queryable.
  if info['streamingBuffer']['estimatedRows'] > 0:
    break
  time.sleep(5)
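The polling loop above can be generalized into a small helper with an overall timeout, so a slow streaming buffer cannot block the notebook forever. A minimal sketch in plain Python; the `wait_until` name is an assumption of this example, and in practice the predicate would wrap the tables_get check from the loop above:

```python
import time

def wait_until(predicate, timeout=300, interval=5):
    """Poll predicate() until it returns True or `timeout` seconds elapse.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    return False

# Example: a predicate that is immediately true returns without sleeping.
ready = wait_until(lambda: True, timeout=10, interval=1)
```

In the Datalab case the predicate would return True when 'streamingBuffer' is absent from the table info or its 'estimatedRows' is greater than zero.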

