Google BQ - how to upsert existing data in tables?

Question

I'm using the Python client library to load data into BigQuery tables. I need to update some changed rows in those tables, but I couldn't figure out how to do that correctly. I want something like an UPSERT: insert a row only if it does not already exist, otherwise update the existing row.

Is it right to keep a special checksum field in the tables and compare checksums during loading? If there is a good approach, how can I implement it with the Python client? (As far as I know, it can't update existing data.)

Please explain: what is the best practice?

Recommended answer

BigQuery is, by design, append-only preferred. That means you are better off keeping duplicate rows for the same entity in the table and writing your queries to always read the most recent row.
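
As a minimal sketch of "always read the most recent row", the helper below builds a query that keeps only the newest row per entity. The table name, the key column and the timestamp column are hypothetical; the answer does not name any, and the sketch assumes each appended row carries a timestamp. The client is imported lazily so the query builder can be tried without credentials.

```python
def latest_rows_sql(table: str, key: str, ts: str) -> str:
    """Build a query that returns only the newest row for each key value."""
    return (
        "SELECT * EXCEPT(rn) FROM ("
        f"SELECT *, ROW_NUMBER() OVER (PARTITION BY {key} ORDER BY {ts} DESC) AS rn "
        f"FROM `{table}`) WHERE rn = 1"
    )


def fetch_latest_rows(table: str, key: str, ts: str):
    """Run the query with the Python client (needs google-cloud-bigquery)."""
    # Lazy import: only needed when the query is actually executed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default credentials and project
    return list(client.query(latest_rows_sql(table, key, ts)).result())


# Hypothetical names, for illustration only.
print(latest_rows_sql("my_project.my_dataset.users", "user_id", "updated_at"))
```

Saving such a query as a view is one way to let readers see only current rows while writers keep appending.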

Updating rows the way you would in transactional tables is possible, but with limitations. Your project can make up to 1,500 table operations per table per day. That is very limited, and those operations serve a totally different purpose. One operation can touch multiple rows, but the cap is still 1,500 operations per table per day. So if you want to update rows individually, that will not work out, because one row per operation caps you at 1,500 row updates per day.
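
Because one operation can touch many rows, changed rows can be batched and applied together. The sketch below uses BigQuery's MERGE statement for that; this is a technique the answer itself does not mention, shown only as one possible shape of a batched upsert. It assumes the changed rows were first loaded into a staging table holding at most one row per key; all table and column names are hypothetical.

```python
def merge_sql(target: str, staging: str, key: str, columns: list) -> str:
    """Build one MERGE statement that upserts every staged row at once."""
    set_clause = ", ".join(f"{c} = S.{c}" for c in columns)
    all_columns = [key, *columns]
    insert_columns = ", ".join(all_columns)
    insert_values = ", ".join(f"S.{c}" for c in all_columns)
    return (
        f"MERGE `{target}` AS T\n"
        f"USING `{staging}` AS S\n"
        f"ON T.{key} = S.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_columns}) VALUES ({insert_values})"
    )


def run_merge(target: str, staging: str, key: str, columns: list) -> None:
    """Execute the MERGE as a single query job (needs google-cloud-bigquery)."""
    from google.cloud import bigquery  # lazy import, only needed to execute

    client = bigquery.Client()  # assumes default credentials and project
    client.query(merge_sql(target, staging, key, columns)).result()


# Hypothetical names, for illustration only.
print(merge_sql("p.d.users", "p.d.users_staging", "user_id", ["name", "email"]))
```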

Since BQ is used as a data lake, you should just stream a new row every time the user, for example, updates their profile. After 20 saves you will end up with 20 rows for the same user. Later you can rematerialize your table to have unique rows by removing the duplicate data.
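
The append-then-rematerialize flow could be sketched with the Python client roughly as follows. The `updated_at` column, the helper names and the table names are assumptions made for illustration; `insert_rows_json` is the client's streaming-insert call and returns a list of per-row errors, which is empty on success.

```python
from datetime import datetime, timezone


def versioned_row(row: dict, now=None) -> dict:
    """Copy the row and stamp it so the newest version can be found later."""
    stamped = dict(row)
    stamped["updated_at"] = (now or datetime.now(timezone.utc)).isoformat()
    return stamped


def append_rows(table: str, rows: list) -> None:
    """Stream new versions of rows into the table; nothing is updated in place."""
    from google.cloud import bigquery  # lazy import, only needed to execute

    client = bigquery.Client()  # assumes default credentials and project
    errors = client.insert_rows_json(table, [versioned_row(r) for r in rows])
    if errors:
        raise RuntimeError(f"streaming insert failed: {errors}")


def rematerialize_sql(source: str, dest: str, key: str, ts: str = "updated_at") -> str:
    """Build a statement that writes one row per key, the newest, into dest."""
    return (
        f"CREATE OR REPLACE TABLE `{dest}` AS "
        "SELECT * EXCEPT(rn) FROM ("
        f"SELECT *, ROW_NUMBER() OVER (PARTITION BY {key} ORDER BY {ts} DESC) AS rn "
        f"FROM `{source}`) WHERE rn = 1"
    )
```

Writing the deduplicated rows to a separate destination table, as assumed here, leaves the append-only history intact; whether to overwrite the source instead is a design choice the answer leaves open.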

See also the related question: BigQuery - DELETE statement to remove duplicates
