How do I get millisecond precision in Hive?

Question

The documentation says that timestamps support the following conversion:

• Floating point numeric types: Interpreted as UNIX timestamp in seconds with decimal precision

First of all, I'm not sure how to interpret this. If I have a timestamp 2013-01-01 12:00:00.423, can I convert it to a numeric type that retains the milliseconds? That is what I want.

More generally, I need to do comparisons between timestamps, such as

select maxts - mints as latency from mytable

where maxts and mints are timestamp columns. Currently, this gives me a NullPointerException in Hive 0.11.0. I am able to run the query if I do something like

select unix_timestamp(maxts) - unix_timestamp(mints) as latency from mytable

but this only gives a result in whole seconds, not millisecond precision.

Any help is appreciated. Tell me if you need additional information.

Recommended answer

If you want to work with milliseconds, don't use the unix timestamp functions, because they treat a date as a whole number of seconds since the epoch.

hive> describe function extended unix_timestamp;
unix_timestamp([date[, pattern]]) - Returns the UNIX timestamp
Converts the current or specified time to number of seconds since 1970-01-01.
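
To see why a whole-second value cannot express the difference, here is a minimal sketch in Python (not Hive) of the same arithmetic. The helper name whole_epoch_seconds is hypothetical, and the sample values are assumed to be UTC purely for illustration; Hive itself applies the session time zone.

```python
from datetime import datetime, timezone

def whole_epoch_seconds(text):
    """Return whole seconds since 1970-01-01, dropping the fractional part.

    Assumption: the input is interpreted as UTC. This only mimics the
    "number of seconds" behaviour described above; it is not Hive code.
    """
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    return int(parsed.replace(tzinfo=timezone.utc).timestamp())

start = whole_epoch_seconds("2013-01-01 12:00:00.423")
end = whole_epoch_seconds("2013-01-01 12:00:00.733")

# Both values fall within the same second, so the 310 ms gap is lost.
print(end - start)
```

Because both timestamps land in the same second, the subtraction has nothing left to report.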

Instead, convert the JDBC-compliant timestamp to double.
For example:

Given tab-delimited data:

cat /user/hive/ts/data.txt :
a   2013-01-01 12:00:00.423   2013-01-01 12:00:00.433
b   2013-01-01 12:00:00.423   2013-01-01 12:00:00.733

CREATE EXTERNAL TABLE ts (txt string, st Timestamp, et Timestamp) 
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/hive/ts';

Then you can query the difference between the start time (st) and the end time (et) in milliseconds as follows:

select 
  txt, 
  cast(
    round(
      cast((e-s) as double) * 1000
    ) as int
  ) latency 
from (select txt, cast(st as double) s, cast(et as double) e from ts) q;
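
The round before the final cast matters because fractional seconds stored as a double are not exact. The following Python sketch (again an illustration of the arithmetic, not Hive code) walks through the same steps on the two sample rows. The helper names to_epoch_seconds and latency_ms are hypothetical, and UTC is assumed for the sample values.

```python
from datetime import datetime, timezone

def to_epoch_seconds(text):
    """Return fractional seconds since the epoch as a float.

    Assumption: the input is interpreted as UTC; this stands in for
    casting a timestamp to double in the query above.
    """
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc).timestamp()

def latency_ms(start_text, end_text):
    """Difference in milliseconds, rounded to the nearest integer."""
    delta_seconds = to_epoch_seconds(end_text) - to_epoch_seconds(start_text)
    # delta_seconds * 1000 may be slightly off from a whole number because
    # of floating point error, so round before converting to an integer.
    return int(round(delta_seconds * 1000))

rows = [
    ("a", "2013-01-01 12:00:00.423", "2013-01-01 12:00:00.433"),
    ("b", "2013-01-01 12:00:00.423", "2013-01-01 12:00:00.733"),
]

for txt, st, et in rows:
    print(txt, latency_ms(st, et))
```

Under these assumptions the two rows come out as 10 and 310 milliseconds. Truncating without rounding could report one millisecond less when the floating point product lands just below a whole number.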
