Update or Delete tables with streaming buffer in BigQuery?

Question

I'm getting the following error when trying to delete records from a table that was created through the GCP Console and updated with the GCP BigQuery Node.js table insert function.

UPDATE or DELETE DML statements are not supported over table stackdriver-360-150317:my_dataset.users with streaming buffer

The table was created without streaming features. But from what I'm reading in the documentation, "Tables that have been written to recently via BigQuery Streaming (tabledata.insertall) cannot be modified using UPDATE or DELETE statements."

Does that mean that once a record has been inserted into a table with this function, there is no way to delete records at all? If so, does the table need to be deleted and recreated from scratch? If not, can you suggest a workaround to avoid this issue?

Thanks!

Including the new error message for SEO: "UPDATE or DELETE statement over table ... would affect rows in the streaming buffer, which is not supported" -- Fh

Recommended answer

To check whether the table has a streaming buffer, look for a section named streamingBuffer in the tables.get response. Alternatively, when streaming to a partitioned table, data in the streaming buffer has a NULL value for the _PARTITIONTIME pseudo column, so a simple query with WHERE _PARTITIONTIME IS NULL can check for it.
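As a rough sketch of the check described above, the helper below inspects a metadata object shaped like a tables.get response. The function names describeStreamingBuffer and bufferedRowCountQuery are made up for illustration. The sketch assumes the streamingBuffer section carries estimatedRows and oldestEntryTime as strings, which is how the REST API serializes 64-bit numbers. With the Node.js client, the metadata object would typically come from `const [metadata] = await table.getMetadata();`, which is deliberately not called here so the snippet runs without credentials.

```javascript
// Hypothetical helper: summarizes the streamingBuffer section of a
// tables.get response. The section is absent when nothing is buffered.
function describeStreamingBuffer(metadata) {
  const buffer = metadata && metadata.streamingBuffer;
  if (!buffer) {
    return { hasBuffer: false, estimatedRows: 0, oldestEntry: null };
  }
  // 64-bit values arrive as strings, so convert them explicitly.
  const oldestMs = Number(buffer.oldestEntryTime);
  return {
    hasBuffer: true,
    estimatedRows: Number(buffer.estimatedRows || 0),
    oldestEntry:
      Number.isFinite(oldestMs) && oldestMs > 0 ? new Date(oldestMs) : null,
  };
}

// Hypothetical helper for partitioned tables: rows still in the buffer
// have a NULL _PARTITIONTIME, so counting them shows whether it is empty.
function bufferedRowCountQuery(tableRef) {
  return (
    "SELECT COUNT(*) AS buffered_rows " +
    `FROM \`${tableRef}\` WHERE _PARTITIONTIME IS NULL`
  );
}

// Example call with a hand-written metadata object (not real API output).
console.log(
  describeStreamingBuffer({
    streamingBuffer: { estimatedRows: "3", oldestEntryTime: "1500000000000" },
  })
);
```

If hasBuffer is false, an UPDATE or DELETE against the table should no longer hit the streaming-buffer error.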

Streamed data is available for real-time analysis within a few seconds of the first streaming insertion into a table, but it can take up to 90 minutes to become available for copy/export and other operations. You will probably have to wait up to 90 minutes for the whole buffer to be persisted on the cluster. You can use queries to see whether the streaming buffer is empty, as you mentioned.

If you create the table with a load job, it won't have a streaming buffer, but you probably streamed some values into it afterwards.

Note the answer below for working with tables that have ongoing streaming buffers: just use a WHERE clause to filter out the latest minutes of data and your queries will work. -- Fh
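One way to apply the WHERE filter from the comment above is sketched here. It assumes the table has a timestamp column recording when each row was inserted; the column name created_at, the condition, and the helper buildBufferSafeDelete are all hypothetical. The 90-minute default is borrowed from the upper bound mentioned in the answer, since the comment does not say how many minutes to exclude.

```javascript
// Hypothetical helper: builds a standard SQL DELETE that skips rows newer
// than `minutes`, so it should not touch rows still in the streaming buffer.
function buildBufferSafeDelete(tableRef, condition, timestampColumn, minutes = 90) {
  if (!Number.isInteger(minutes) || minutes <= 0) {
    throw new RangeError("minutes must be a positive integer");
  }
  // Accept only a plain identifier, since it is spliced into the SQL text.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(timestampColumn)) {
    throw new TypeError("timestampColumn must be a plain column name");
  }
  return [
    `DELETE FROM \`${tableRef}\``,
    `WHERE (${condition})`,
    `  AND ${timestampColumn} < ` +
      `TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ${minutes} MINUTE)`,
  ].join("\n");
}

// Example: the table from the question, written with dots for standard SQL.
// "id = @id" and "created_at" are assumptions, not taken from the question.
console.log(
  buildBufferSafeDelete(
    "stackdriver-360-150317.my_dataset.users",
    "id = @id",
    "created_at"
  )
);
```

The filter only helps if the chosen column really reflects insertion time; rows with old timestamps that were streamed recently would still sit in the buffer.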
