Hive and SparkSQL do not support datetime type?


Problem Description

Is it true that Hive and SparkSQL do not support the datatype of datetime?

From my reading of the references, they seem to support only date and timestamp. The former does not have a time component (i.e. hour, minute, and second); the latter has high precision (down to millisecond) but is not very human readable (it always requires a conversion by from_unixtime() or date_format(), and the result would be a string, not a datetime type).
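
One quick way to confirm what each expression actually returns is Spark SQL's typeof() function (available in Spark 3.0 and later; plain Hive has no typeof(), so there you would create a table from the query and run describe instead). A minimal sketch:

-- SparkSQL sketch: typeof() (Spark 3.0+) reports the result type of each expression
select 
  typeof(current_date)                    typeOfDate,      -- 'date'
  typeof(unix_timestamp())                typeOfUnixTs,    -- 'bigint'
  typeof(from_unixtime(unix_timestamp())) typeOfString     -- 'string', not a datetime type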

In contrast, other database systems, such as MySQL, do have a datetime datatype. (E.g. see this ref)

Any good suggestions/tips on how to work around this limitation?

These are my references:

  1. Hive Language Manual: Date/Time Types
  2. Hive Language Manual: Date Functions

Update: about human readability

Here I give an example on MySQL to illustrate my point about human readability:

-- MySQL code 
select 
  cast(now() as date)         as asDate,          -- human readable
  cast(now() as dateTime)     as asDateTime,      -- human readable  
  unix_timestamp(now())       as asUnixTimestamp, -- not H/R
  cast(from_unixtime(unix_timestamp(now())) 
        as dateTime) 
                              asDateTimeAgain     -- cast into dateTime to make it H/R

It shows this:

(Pay attention to the fourth column asDateTimeAgain, which is human readable)

+------------+---------------------+-----------------+---------------------+
| asDate     | asDateTime          | asUnixTimestamp | asDateTimeAgain     |
+------------+---------------------+-----------------+---------------------+
| 2017-10-21 | 2017-10-21 22:37:15 |      1508625435 | 2017-10-21 22:37:15 |
+------------+---------------------+-----------------+---------------------+
1 row in set (0.00 sec)

And the limitation is not just about human readability. A string representation of a datetime is human readable, but it then loses the properties of a datetime and will require further conversion back into a datetime for date/time processing, such as min() and max(), or for capturing the value into a java.util.Date.

-- Hive/SparkSQL  code
select 
  current_date                    asDate,
  unix_timestamp()                asUnixTimestamp,
  from_unixtime(unix_timestamp(), 
    'yyyy-MM-dd HH:mm:ss')        asString

The output would be this, where the third column is a string and not a datetime type:

------------------------------------------------------
| asDate     | asUnixTimestamp | asString            |
| ---------- | --------------- | ------------------- |
| 2017-10-21 | 1508625586      | 2017-10-21 22:39:46 |
------------------------------------------------------
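
To make that extra round trip concrete: if such a string were stored in a table, it would have to be cast back to a timestamp before any date/time aggregation. A minimal sketch, where the table events and the column event_time_str are hypothetical names used only for illustration:

-- Hive/SparkSQL sketch (events and event_time_str are hypothetical)
select 
  min(cast(event_time_str as timestamp))  firstEvent,   -- cast back before aggregating
  max(cast(event_time_str as timestamp))  lastEvent
from events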

Answer

(I am providing the answer here myself)

Do not confuse timestamp with "unix timestamp"

timestamp is actually human readable, while "unix timestamp", being the number of seconds/milliseconds since Jan 1, 1970, is indeed not very human readable.

However, we can use cast() to convert the latter (through the function from_unixtime()) to get the former.

-- Hive/SparkSQL  code
select 
  current_date                    asDate,          -- human readable
  unix_timestamp()                asUnixTimestamp, -- not human readable
  from_unixtime(unix_timestamp()) asString,        -- human readable
  cast(from_unixtime(unix_timestamp())             
    as date)                      asDate2,         -- human readable
  cast(from_unixtime(unix_timestamp())             
    as timestamp)                 asTimestamp      -- human readable

Result:

-------------------------------------------------------------------------------------------
| asDate     | asUnixTimestamp | asString            | asDate2    | asTimestamp           |
| ---------- | --------------- | ------------------- | ---------- | --------------------- |
| 2017-10-22 | 1508687321      | 2017-10-22 15:48:41 | 2017-10-22 | 2017-10-22 15:48:41.0 |
-------------------------------------------------------------------------------------------
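
Because asTimestamp is a real timestamp value rather than a string, it can be passed straight to date/time functions without another conversion; a minimal follow-up sketch:

-- Hive/SparkSQL sketch: the casted timestamp feeds date/time functions directly
select 
  year(cast(from_unixtime(unix_timestamp()) as timestamp))  asYear,   -- e.g. 2017
  hour(cast(from_unixtime(unix_timestamp()) as timestamp))  asHour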

