Python: How to update a value in Google BigQuery in less than 40 seconds?

Problem description

I have a table in Google BigQuery that I access and modify from Python using the pandas functions read_gbq and to_gbq. The problem is that appending 100,000 rows takes about 150 seconds, while appending a single row takes about 40 seconds. I would like to update a value in the table rather than append a row. Is there a way to update a value from Python that is very fast, or at least faster than 40 seconds?

Recommended answer

I'm not sure whether you can do this with pandas, but you certainly can with the google-cloud library.

You can install it (pip install --upgrade google-cloud) and run something like:

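import os
import uuid

from google.cloud.bigquery.client import Client

# Point the client at a service-account key before creating it
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path_to_json_credentials.json'

bq_client = Client()

# Each job gets a unique name
job_id = str(uuid.uuid4())
query = """UPDATE `dataset.table` SET field_1 = '3' WHERE field_2 = '1'"""

# Note: run_async_query exists only in google-cloud-bigquery releases before 0.28;
# later releases replaced it with Client.query()
job = bq_client.run_async_query(query=query, job_name=job_id)
job.use_legacy_sql = False  # DML statements require standard SQL
job.begin()  # start the job; this call does not wait for it to finish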

In my tests, this operation takes about 2 seconds on average.
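On current releases of the client library (installed with pip install --upgrade google-cloud-bigquery), the same UPDATE would be submitted through Client.query(). The following is a minimal sketch rather than the answerer's own code: update_value and main are hypothetical helper names, field_1 and field_2 are the placeholder columns from the query above, and the credentials are assumed to be set through GOOGLE_APPLICATION_CREDENTIALS as before. It also measures the elapsed time so you can compare it against the 40 seconds seen with to_gbq.

```python
import time


def update_value(client, table, new_value, match_value):
    """Run one DML UPDATE and return (rows_affected, seconds_elapsed).

    `client` is expected to be a google.cloud.bigquery.Client. The column
    names field_1 and field_2 are the placeholders from the example query.
    """
    # Values are interpolated directly to mirror the original query;
    # prefer query parameters when the input is not trusted.
    sql = (
        f"UPDATE `{table}` "
        f"SET field_1 = '{new_value}' "
        f"WHERE field_2 = '{match_value}'"
    )
    start = time.perf_counter()
    job = client.query(sql)  # submits the query job
    job.result()             # blocks until the UPDATE has finished
    return job.num_dml_affected_rows, time.perf_counter() - start


def main():
    # Imported here so update_value can be exercised without the library.
    from google.cloud import bigquery

    # Assumes GOOGLE_APPLICATION_CREDENTIALS points at a service-account key.
    rows, seconds = update_value(bigquery.Client(), "dataset.table", "3", "1")
    print(f"updated {rows} row(s) in {seconds:.1f}s")
```

Calling main() runs the update against `dataset.table`. Unlike job.begin(), job.result() waits for completion, so the measured time covers the whole operation.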

As a side note, keep in mind the quotas that apply to DML operations in BigQuery: know when DML is appropriate and whether it fits your needs.
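If the concern is the number of DML statements you issue, one possible way to reduce it is to fold several value changes into a single UPDATE with a CASE expression. The sketch below only builds the SQL text; build_batched_update is a hypothetical helper, and field_1 and field_2 are again the placeholder columns from the example query. Whether this suits your workload depends on the quotas in force for your project.

```python
def build_batched_update(table, changes):
    """Build one UPDATE that sets field_1 for several field_2 values.

    `changes` maps a field_2 value to the new field_1 value.
    """
    if not changes:
        raise ValueError("changes must not be empty")

    def quote(value):
        # Escape backslashes and single quotes for a SQL string literal
        text = str(value).replace("\\", "\\\\").replace("'", "\\'")
        return "'" + text + "'"

    whens = " ".join(
        f"WHEN {quote(key)} THEN {quote(new)}" for key, new in changes.items()
    )
    keys = ", ".join(quote(key) for key in changes)
    return (
        f"UPDATE `{table}` "
        f"SET field_1 = CASE field_2 {whens} ELSE field_1 END "
        f"WHERE field_2 IN ({keys})"
    )
```

The returned string can be passed to the client in place of the single-row query, so that many changes cost one statement instead of one statement each.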
