BigQuery not handling millisecond timestamps in a partition column


Question

I have a Unix timestamp column in my CSV file, expressed in milliseconds. When I insert this data into my BigQuery table and query it, I get this error:

BigQuery does not support millisecond timestamps

Now I would like to use this column as the partition column. I have a few questions: 1) Even if I save it as INT64, how can I partition on this field? 2) I would like to avoid duplicate tables.

Recommended answer

If your timestamp data is stored as milliseconds since the epoch, you won't be able to create the partitioned table properly from it as-is. Instead, you should use a "TIMESTAMP or DATE column", as stated by @TimBiegeleisen. BigQuery's TIMESTAMP type has microsecond precision. Once your column has been converted to values that BigQuery accepts as a TIMESTAMP, you can use something like the following to create the partitioned table:

bq load --schema <your-timestamp-column>:TIMESTAMP,<some-other-column>:FLOAT --skip_leading_rows=1 --source_format=CSV --time_partitioning_field=<your-timestamp-column> <your-dataset>.<your-table> <your-csv-file>

(Use --skip_leading_rows=1 if the first row of the CSV file contains the column names.)
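
The conversion step mentioned above is not spelled out in the answer, so here is a minimal sketch of one way to do it before running bq load. It assumes the CSV has a header row and that the millisecond column is named event_ms (a hypothetical name, as is the helper convert_csv); it rewrites each value as a "YYYY-MM-DD HH:MM:SS.ffffff" string, which BigQuery should parse as a UTC TIMESTAMP when no time zone is given.

```python
import csv
import io
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def millis_to_bq_timestamp(ms: int) -> str:
    """Format epoch milliseconds as 'YYYY-MM-DD HH:MM:SS.ffffff' in UTC."""
    # Integer timedelta arithmetic avoids floating-point rounding of the millis.
    ts = EPOCH + timedelta(milliseconds=ms)
    return ts.strftime("%Y-%m-%d %H:%M:%S.%f")


def convert_csv(src, dst, column: str) -> None:
    """Copy CSV rows from src to dst, converting `column` from epoch millis.

    `src` and `dst` are text file objects; the header row is preserved.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = millis_to_bq_timestamp(int(row[column]))
        writer.writerow(row)


if __name__ == "__main__":
    # "event_ms" and "value" are made-up column names for illustration only.
    sample = io.StringIO("event_ms,value\n1500000000123,1.5\n")
    out = io.StringIO()
    convert_csv(sample, out, "event_ms")
    print(out.getvalue(), end="")
```

The converted file can then be passed to the bq load command above, with the converted column declared as TIMESTAMP and used as the --time_partitioning_field.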

Query your table using standard SQL, not legacy SQL. As the official docs state:

You cannot use legacy SQL to query partitioned tables or to write query results to partitioned tables.
