Hive DATE manipulation in a query

Problem description

I am trying to run a Hive query. I have a table with, let's say, 3 columns. One of them is a date column with data such as:

a d 2014-04-01
b e 2014-04-03
c f 2014-04-20

Now I want to pick the maximum date from the above data, compute its difference from the current date (let's assume the current date is 2014-04-24), and add that difference to every date in the output. That is, the query should pick 2014-04-20, subtract it from the current date to get 4, and then add this difference to all the dates to produce:

a d 2014-04-05
b e 2014-04-07
c f 2014-04-24

I tried this, but it runs into a semantic issue:

select A, B, date_add( SOMEDATE, datediff(to_date( FROM_UNIXTIME(UNIX_TIMESTAMP() )), max(SOMEDATE))) As SOMEDATE

Solution

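This is doable using Hive's date UDFs (DATEDIFF, FROM_UNIXTIME, UNIX_TIMESTAMP, DATE_ADD): https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions. The attempt in the question mixes the aggregate max(SOMEDATE) with non-aggregated columns in a single SELECT without a GROUP BY, which Hive rejects, so the maximum has to be computed in a subquery.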

Assuming your source table definition is: DateSource(col1 string, col2 string, myDate string)

The query would be:

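-- JOIN with no ON clause is a cross join: every row is paired with the
-- single-row subquery that holds the day offset.
SELECT col1, col2, myDate, DATE_ADD(myDate, daysDiff) AS adjustedDate
FROM DateSource
JOIN
  (
    -- Days between today and the latest date in the table.
    SELECT DATEDIFF(FROM_UNIXTIME(UNIX_TIMESTAMP(), "yyyy-MM-dd"), maxDate) AS daysDiff
    FROM
      (
        -- myDate is a yyyy-MM-dd string, so string MAX is also the latest date.
        SELECT MAX(myDate) AS maxDate FROM DateSource
      ) maxDate
  ) diffDate;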
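
The same result can probably be obtained without the join. The following is only a sketch, not part of the original answer: it assumes a Hive version with windowing functions (0.11 or later) and CURRENT_DATE (1.2.0 or later), and it reuses the DateSource table and column names assumed above.

```sql
-- Sketch only: assumes Hive >= 1.2.0 (CURRENT_DATE) and windowing support.
-- MAX(myDate) OVER () attaches the table-wide latest date to every row,
-- so no self-join is needed to compute the offset.
SELECT col1,
       col2,
       myDate,
       DATE_ADD(myDate,
                DATEDIFF(CURRENT_DATE, MAX(myDate) OVER ())) AS adjustedDate
FROM DateSource;
```

With the sample rows and a current date of 2014-04-24, the offset is DATEDIFF('2014-04-24', '2014-04-20') = 4, which matches the expected output in the question. An empty OVER () places all rows in a single window partition, so on a large table the join-based query may scale better.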
