BigQuery not handling millisecond timestamps in a partition column


Question

I have a Unix timestamp column that is represented in milliseconds in my CSV file. When I insert this data into my BigQuery table and query it, I get this error:

BigQuery does not support millisecond timestamps

Now I would like to use this column as a partition column. I have two questions:
1) Even if I save it as INT64, how can I make this field a partition column?
2) I would like to avoid duplicate tables.

Recommended answer

If your timestamp data is represented in milliseconds, you won't be able to create the partitioned table properly. Instead, you should use a "TIMESTAMP or DATE column", as stated by @TimBiegeleisen. A TIMESTAMP column uses microsecond precision. Once your column has been converted to a TIMESTAMP, you can use something like the following to create the partitioned table:

bq load --schema <your-timestamp-column>:TIMESTAMP,<some-other-column>:FLOAT --skip_leading_rows=1 --source_format=CSV --time_partitioning_field=<your-timestamp-column> <your-dataset>.<your-table> <your-csv-file>

(Use --skip_leading_rows if your CSV file has the column names in its first row.)
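The conversion step that has to happen before the load could be sketched roughly as follows. This is only an illustration, not part of the original answer: it assumes the millisecond values are plain integers in a named CSV column, and the function names (millis_to_bq_timestamp, convert_csv) and the column name "ts" are hypothetical. It rewrites each value as a "YYYY-MM-DD HH:MM:SS.ffffff UTC" string, which is a form BigQuery should accept for a TIMESTAMP column in a CSV load.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime, so the arithmetic stays in UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def millis_to_bq_timestamp(ms: int) -> str:
    """Format Unix epoch milliseconds as 'YYYY-MM-DD HH:MM:SS.ffffff UTC'."""
    # Integer timedelta arithmetic avoids float rounding of the fraction.
    dt = EPOCH + timedelta(milliseconds=ms)
    return dt.strftime("%Y-%m-%d %H:%M:%S.%f") + " UTC"


def convert_csv(src, dst, column: str) -> None:
    """Copy a CSV with a header row, rewriting one millisecond column.

    `src` and `dst` are open text file objects; `column` is the
    (hypothetical) name of the millisecond timestamp column.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        row[column] = millis_to_bq_timestamp(int(row[column]))
        writer.writerow(row)


if __name__ == "__main__":
    # Tiny in-memory demonstration with made-up data.
    source = io.StringIO("ts,value\n1500000000123,1.5\n")
    target = io.StringIO()
    convert_csv(source, target, "ts")
    print(target.getvalue())
```

The converted file still has its header row, so --skip_leading_rows=1 remains necessary in the bq load command. If the data is already loaded as INT64, converting inside BigQuery with Standard SQL would be an alternative to preprocessing the file.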

Query your table using Standard SQL, not Legacy SQL. As the official documentation states:

You cannot use legacy SQL to query partitioned tables or to write query results to partitioned tables.
