How do I get millisecond precision in Hive?


Problem description

The documentation says that timestamps support the following conversion:

•Floating point numeric types: Interpreted as UNIX timestamp in seconds with decimal precision

First of all, I'm not sure how to interpret this. If I have a timestamp 2013-01-01 12:00:00.423, can I convert this to a numeric type that retains the milliseconds? Because that is what I want.

More generally, I need to do comparisons between timestamps such as

select maxts - mints as latency from mytable

where maxts and mints are timestamp columns. Currently, this gives me a NullPointerException in Hive 0.11.0. I am able to run the query if I do something like

select unix_timestamp(maxts) - unix_timestamp(mints) as latency from mytable

but this only works for seconds, not millisecond precision.

Any help appreciated. Tell me if you need additional information.

Solution

If you want to work with milliseconds, don't use the unix_timestamp functions, because they treat a date as a whole number of seconds since the epoch.

hive> describe function extended unix_timestamp;
unix_timestamp([date[, pattern]]) - Returns the UNIX timestamp
Converts the current or specified time to number of seconds since 1970-01-01.

Instead, cast the JDBC-compliant timestamp to double.
For example:

Given tab-delimited data:

cat /user/hive/ts/data.txt :
a   2013-01-01 12:00:00.423   2013-01-01 12:00:00.433
b   2013-01-01 12:00:00.423   2013-01-01 12:00:00.733

CREATE EXTERNAL TABLE ts (txt string, st Timestamp, et Timestamp) 
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/user/hive/ts';

Then you can query the difference between the start time (st) and the end time (et) in milliseconds as follows:

select 
  txt, 
  cast(
    round(
      cast((e-s) as double) * 1000
    ) as int
  ) latency 
from (select txt, cast(st as double) s, cast(et as double) e from ts) q;
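
The arithmetic behind that query can be checked outside Hive. The following is only a rough Python sketch, not Hive code: the function names are hypothetical, and parsing the strings as UTC is an assumption made for the illustration (it does not change the difference between two timestamps here). It shows why whole-second values lose the sub-second part, and why the query rounds before casting to int: a double difference multiplied by 1000 may land slightly off a whole number.

```python
from datetime import datetime, timezone


def to_epoch_seconds(text: str) -> float:
    """Parse 'YYYY-MM-DD HH:MM:SS.fff' (assumed UTC) into fractional epoch seconds."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc).timestamp()


def latency_whole_seconds(start: str, end: str) -> int:
    """Difference of whole-second epoch values; the sub-second part is dropped."""
    return int(to_epoch_seconds(end)) - int(to_epoch_seconds(start))


def latency_ms(start: str, end: str) -> int:
    """Difference of fractional epoch seconds, scaled to ms and rounded,
    analogous to cast(round((e - s) * 1000) as int) in the query above."""
    diff_seconds = to_epoch_seconds(end) - to_epoch_seconds(start)
    return int(round(diff_seconds * 1000))


if __name__ == "__main__":
    # The two rows from data.txt above.
    rows = [
        ("a", "2013-01-01 12:00:00.423", "2013-01-01 12:00:00.433"),
        ("b", "2013-01-01 12:00:00.423", "2013-01-01 12:00:00.733"),
    ]
    for txt, st, et in rows:
        print(txt, latency_whole_seconds(st, et), latency_ms(st, et))
```

With the two sample rows, the whole-second difference is 0 in both cases, while the rounded millisecond difference is 10 for row a and 310 for row b.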
