BigQuery streaming and partitions: when is _PARTITIONTIME really evaluated?


Question

_PARTITIONTIME represents the time (truncated to the day) at which a row is inserted into BigQuery.

However, looking closely at the streaming mechanism (https://cloud.google.com/blog/products/gcp/life-of-a-bigquery-streaming-insert ), there are three different "insertion times" when a row is inserted into BigQuery:

  • the time the row is received by the "streaming extraction worker"
  • the time the row is stored in the "streaming buffer"
  • the time the row is picked up by the extraction worker to be stored in the final (Capacitor) storage

Does anybody know which of these three moments corresponds to _PARTITIONTIME?

Recommended answer

While a row is still in the streaming buffer, its _PARTITIONTIME is NULL. Once the row has been extracted, the extraction time becomes its _PARTITIONTIME, so of the three moments in the question it is the third one that counts. The exception is a row streamed directly into a specific partition, for example "table$20180101": in that case _PARTITIONTIME is always "2018-01-01".
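The rule in the answer can be written down as a small sketch. This is only an illustration in plain Python, not BigQuery client code: the helper names (`partition_decorator`, `streaming_buffer_query`, `expected_partition_day`) are hypothetical, and the query text assumes standard SQL against an ingestion-time partitioned table whose name is supplied by the caller.

```python
from datetime import date
from typing import Optional


def partition_decorator(table: str, day: date) -> str:
    """Build a partition-decorated table name such as 'table$20180101'."""
    return f"{table}${day:%Y%m%d}"


def streaming_buffer_query(table: str) -> str:
    """Build a query counting rows whose _PARTITIONTIME is still NULL,
    i.e. rows that, per the answer, are still in the streaming buffer."""
    return (
        f"SELECT COUNT(*) AS buffered_rows FROM `{table}` "
        "WHERE _PARTITIONTIME IS NULL"
    )


def expected_partition_day(
    decorator_day: Optional[date], extraction_day: Optional[date]
) -> Optional[date]:
    """Model the rule described in the answer (hypothetical helper).

    - Row streamed to 'table$YYYYMMDD': always that day.
    - Otherwise: None while the row is still buffered (not yet extracted),
      then the day on which it was extracted.
    """
    if decorator_day is not None:
        return decorator_day
    return extraction_day


if __name__ == "__main__":
    print(partition_decorator("table", date(2018, 1, 1)))
    print(streaming_buffer_query("my_dataset.my_table"))
    print(expected_partition_day(None, None))
```

Running the query produced by `streaming_buffer_query` shortly after a streaming insert, and again later, is one way to watch rows move out of the buffer. The dataset and table names above are placeholders.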
