Compatibility of Avro dates and times with BigQuery?


Problem description

BigQuery generally does a good job of loading Avro data, but "bq load" is having a lot of trouble with timestamps and other date/time fields that use the Avro logicalType attribute.

  1. My data with Avro type timestamp-millis is mangled because BigQuery TIMESTAMP interprets the values as microsecond timestamps (off by a factor of 1000).
  2. A timestamp-micros integer that can load into TIMESTAMP becomes INVALID in a BigQuery DATETIME. I can't find an explanation of what would be valid at https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types
  3. Strings in ISO8601 format can't load into TIMESTAMP or DATETIME (Incompatible types error) but I think BigQuery would support that if I was loading plain JSON.
  4. Avro "date" type fails to load into DATE (also Incompatible types).
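The factor-of-1000 problem in item 1 and the day-count encoding behind item 4 can be reproduced without BigQuery. This is a minimal Python sketch, assuming the loader treats every Avro long as microseconds since the Unix epoch; the helper names and the sample value are illustrative and do not come from the original post.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def read_as_millis(value: int) -> datetime:
    """Interpret an integer as milliseconds since the Unix epoch."""
    return EPOCH + timedelta(milliseconds=value)


def read_as_micros(value: int) -> datetime:
    """Interpret an integer as microseconds since the Unix epoch."""
    return EPOCH + timedelta(microseconds=value)


def avro_date_to_date(days: int) -> date:
    """Avro 'date' stores the number of days since 1970-01-01 as an int."""
    return date(1970, 1, 1) + timedelta(days=days)


# A value written by the producer as timestamp-millis.
stored = 1_500_000_000_000

print(read_as_millis(stored).isoformat())  # what the producer meant
print(read_as_micros(stored).isoformat())  # what a micros-only reader sees

# Scaling by 1000 before writing makes both readings agree.
print(read_as_micros(stored * 1000) == read_as_millis(stored))

# Hypothetical day count for an Avro 'date' field.
print(avro_date_to_date(17361).isoformat())
```

Under that assumption, a producer that cannot change the loader could write microseconds instead of milliseconds, at the cost of no longer matching the declared timestamp-millis type.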

I guess I could work around these problems by always loading the data into temporary fields and using queries to CAST or transform them into additional fields, but that doesn't scale, doesn't support schema evolution, and doesn't work well with streaming. Producing data in Avro with well-defined schemas is supposed to avoid that extra step of transforming data again for different consumers.
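For illustration, the staging workaround might look roughly like the sketch below. The table and column names (`mydataset.staging`, `event_ms`, `event_day`) are hypothetical; `TIMESTAMP_MILLIS` and `DATE_FROM_UNIX_DATE` are BigQuery standard SQL functions. The code only builds the query text and assumes the raw values were loaded as plain integers; running it is left to whatever client is in use.

```python
def build_conversion_query(staging_table, millis_columns, date_columns):
    """Build a BigQuery standard SQL query that converts raw integer
    columns from a staging table into TIMESTAMP and DATE columns."""
    select_items = []
    for col in millis_columns:
        # Integer milliseconds since the epoch -> TIMESTAMP
        select_items.append(f"TIMESTAMP_MILLIS({col}) AS {col}_ts")
    for col in date_columns:
        # Integer days since the epoch -> DATE
        select_items.append(f"DATE_FROM_UNIX_DATE({col}) AS {col}_date")
    select_clause = ",\n  ".join(select_items)
    return f"SELECT\n  {select_clause}\nFROM `{staging_table}`"


# Hypothetical table and column names, for illustration only.
query = build_conversion_query(
    "mydataset.staging",
    millis_columns=["event_ms"],
    date_columns=["event_day"],
)
print(query)
```

Every new date or time field needs another entry in these lists, which is the maintenance burden the question objects to.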

Is BigQuery really this incompatible with Avro dates and times? (or am I doing something dumb)

Or is "bq load" the problem here? Is there a better way to load Avro data?

Solution

Logical types are not supported in BigQuery. BigQuery uses a C++ version of the Apache Avro library. I just checked, and the C++ library doesn't have support for logical types yet.
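For context, an Avro logical type is only an annotation on a primitive type, so a reader that ignores `logicalType` sees a plain `long` or `int`. The sketch below uses a hypothetical record schema (the `Event` record and its field names are made up) to show what remains once the annotation is dropped. It does not reproduce BigQuery's actual loader.

```python
# Hypothetical Avro record schema, written as a Python dict.
SCHEMA = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "created_ms",
         "type": {"type": "long", "logicalType": "timestamp-millis"}},
        {"name": "created_us",
         "type": {"type": "long", "logicalType": "timestamp-micros"}},
        {"name": "day",
         "type": {"type": "int", "logicalType": "date"}},
        {"name": "label", "type": "string"},
    ],
}


def physical_type(field_type):
    """Return the underlying primitive type, ignoring any logicalType."""
    if isinstance(field_type, dict):
        return field_type["type"]
    return field_type


def physical_view(schema):
    """Map each field name to the type a logical-type-unaware reader sees."""
    return {f["name"]: physical_type(f["type"]) for f in schema["fields"]}


print(physical_view(SCHEMA))
```

Both timestamp fields collapse to `long`, so a reader without logical type support has no way to tell milliseconds from microseconds, which matches the symptoms described in the question.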
