BigQuery Stream and Delete while streaming buffer is not empty?


Question

BigQuery doesn't stream directly into its long-term storage; it first puts the data into a write-optimized store and periodically flushes that to the main storage.

I would like to understand the BigQuery streaming buffer better in the following use cases.

1) What if I delete the BigQuery table and recreate a new BigQuery table with the same name right away, while some records are still in the streaming buffer waiting to be flushed into main storage?

For example, say I am streaming a million records into BigQuery. Some of the records are still in the streaming buffer, waiting to be flushed to BigQuery's main storage.

At this point, I delete the BigQuery table and recreate it with the same name. Would the remaining records in the streaming buffer still be flushed into the newly recreated table, or would they be dropped?

My guess is that the remaining records in the streaming buffer will be dropped: even if I delete the table and recreate a table with the same name, the "object id" of the old table and the new table should be different.

Am I correct?

2) What if I run a DELETE query to delete some records that I have just streamed?

Same as above: I stream 1 million records and some of them are still in the streaming buffer. At this point, I issue a DELETE statement that should delete some of the records I just streamed.

But if the records I want to delete are still in the streaming buffer, waiting to be flushed into main storage, when I send the DELETE statement, then my DELETE will not be able to delete them (they are not in BigQuery's main storage yet), and later on these records will be flushed into main storage. That means my DELETE will fail to delete these records.

Am I correct? If so, then for my DELETE to work, I have to find out whether the streaming buffer is empty before I issue it. That will make things more complicated.

Thanks!

Recommended answer

1) Correct. The "object id" is different, and the remaining records will be dropped.

2) Kind of correct. A DML statement cannot modify data that is still in the streaming buffer. However, the rows are not silently skipped: the statement will fail if it tries to touch rows that are still in the streaming buffer.
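One way to act on point 2 is to check the table metadata before issuing the DELETE. The sketch below is only an illustration, not an official recipe. It assumes `client` is a `google.cloud.bigquery.Client` (whose `get_table` returns a table with a `streaming_buffer` property that is `None` when no buffer is reported); the helper names, the timeout, and the polling interval are hypothetical choices. The buffer metadata may lag slightly behind reality, so the DELETE can still fail and may need a retry.

```python
import time


def wait_for_empty_streaming_buffer(client, table_id, timeout_s=1800,
                                    poll_s=60, sleep=time.sleep):
    """Poll table metadata until no streaming buffer is reported.

    `client` is assumed to behave like google.cloud.bigquery.Client.
    `timeout_s` and `poll_s` are arbitrary illustrative defaults.
    Returns True once the buffer is gone, False if the timeout elapses.
    """
    waited = 0
    while True:
        table = client.get_table(table_id)
        # streaming_buffer is None when the API reports no buffered rows.
        if getattr(table, "streaming_buffer", None) is None:
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_s)
        waited += poll_s


def delete_when_flushed(client, table_id, delete_sql, **wait_kwargs):
    """Run a DELETE only after the streaming buffer appears to be empty."""
    if not wait_for_empty_streaming_buffer(client, table_id, **wait_kwargs):
        raise TimeoutError(f"streaming buffer of {table_id} did not drain")
    # client.query() starts the job; result() blocks until it finishes
    # and raises if the statement failed.
    return client.query(delete_sql).result()
```

With the real library this would be called roughly as `delete_when_flushed(bigquery.Client(), "project.dataset.table", "DELETE FROM ... WHERE ...")`, where the table ID and SQL are placeholders.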
