Are timestamps stored with a timezone in Apache Hive?
The following discussion seems to indicate that Hive timestamps have a timezone: https://community.hortonworks.com/questions/83523/timestamp-in-hive-without-timezone.html
The Apache wiki, however, says: "Timestamps are interpreted to be timezoneless and stored as an offset from the UNIX epoch."
I am referring to: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-TimestampstimestampTimestamps
If I use code like the following:
from_unixtime(unix_timestamp(ts_field,'yyyy-MM-dd HH:mm:ss'), 'yyyy-MM-dd HH:mm:ss z') as ts_field_tz
This seems to expose an underlying timezone value.
The phrase "timezone-less" is misleading; what it actually means is that...
"If you have data files written by Hive, those TIMESTAMP values represent the local timezone of the host where the data was written."
That is an excerpt from the Impala documentation. They make it very explicit, because it is a real pain when you need to access the same table from both Hive and Impala, since, contrary to Hive...
"By default, Impala does not store timestamps using the local timezone, to avoid undesired results from unexpected time zone issues. Timestamps are stored and interpreted relative to UTC."
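To make the difference concrete, here is a minimal Python sketch of the behaviour described above. It is only a model, not Hive or Impala code: the fixed UTC-5 "writer" zone, the sample date, and the helper names to_epoch_seconds and render are all assumptions made for illustration. It shows that the stored value is just an epoch offset, and that the zone label you see when formatting comes from whoever formats it, not from the stored data.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical zones for illustration; fixed offsets avoid needing a tz database.
WRITER_ZONE = timezone(timedelta(hours=-5), "EST")  # assumed zone of the writing host
UTC = timezone.utc


def to_epoch_seconds(wall_clock: str, zone: timezone) -> int:
    """Interpret a zone-less 'yyyy-MM-dd HH:mm:ss' string in `zone`; return the epoch offset."""
    naive = datetime.strptime(wall_clock, "%Y-%m-%d %H:%M:%S")
    return int(naive.replace(tzinfo=zone).timestamp())


def render(epoch_seconds: int, zone: timezone) -> str:
    """Format an epoch offset in `zone` and append the zone name (similar to a 'z' pattern)."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=zone)
    return moment.strftime("%Y-%m-%d %H:%M:%S %Z")


# A writer in the local zone turns the wall-clock value into an epoch offset.
stored = to_epoch_seconds("2017-02-15 10:00:00", WRITER_ZONE)

# A reader using the same local zone gets the original wall-clock value back.
same_zone_view = render(stored, WRITER_ZONE)

# A reader that interprets the offset relative to UTC sees a shifted wall clock.
utc_view = render(stored, UTC)

print(stored)
print(same_zone_view)
print(utc_view)
```

With these assumed inputs the script prints 1487170800, then 2017-02-15 10:00:00 EST, then 2017-02-15 15:00:00 UTC. No zone is kept alongside the number itself, which is why a table shared between a local-time engine and a UTC engine can appear to be off by the host's UTC offset.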