Hive中的日期差异小于15分钟 [英] Date Difference less than 15 minutes in Hive

查看:680
本文介绍了Hive中的日期差异小于15分钟的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

以下是我的查询,其中最后一行我试图查看日期之间的差异是否在15分钟内。但是每当我运行下面的查询。

pre $ SELECT TT.BUYER_ID,COUNT(*)FROM
(SELECT testingtable1。 buyer_id,testingtable1.item_id,testingtable1.created_time from(选择user_id,prod_and_ts.product_id作为product_id,prod_and_ts.timestamps作为来自testingtable2的时间戳)LATERAL VIEW爆炸(purchased_item)exploded_table作为prod_and_ts其中to_date(from_unixtime(cast(prod_and_ts.timestamps as BIGINT)) )='2012-07-09')prod_and_ts RIGHT OUTER JOIN(SELECT buyer_id,item_id,rank(buyer_id),created_time,UNIX_TIMESTAMP(created_time)$ b $ FROM(
SELECT buyer_id,item_id,created_time
FROM testingtable1
其中to_date(from_unixtime(cast(UNIX_TIMESTAMP(created_time)as int)))='2012-07-09'
DISTRIBUTE BY buyer_id
SORT BY buyer_id,created_time desc
)a
WHERE rank(buyer_id)< 5)testingtable1 ON(testingtable1.item_id = prod_and_ts.product_id AND testingtable1.BUYER_ID = prod_and_ts.USER_ID
AND abs(datediff(testingtable1.created_time,FROM_UNIXTIME(cast(prod_and_ts.timestamps as BIGINT))))<= 15)其中prod_and_ts.product_id IS NULL ORDER BY testingtable1.buyer_id,testingtable1.created_time desc)TT GROUP BY TT.BUYER_ID ;

我总是得到异常 -

  FAILED:语义分析错误:第10行:144左右别名
遇到加入15

我的查询有什么问题吗?或者在Hive中,我们无法计算分钟之间的日期差异?任何建议将不胜感激。

解决方案

我认为问题在于您的加入。从 Hive语言手册



Hive仅支持等式连接,外连接和左半连接。 Hive不支持不等于
条件的连接条件,因为很难将这种条件表达为
map / reduce作业。



Below is my query, in which in the last line I am trying to see if the difference between the dates is within 15 minutes. But whenever I run the below query.

SELECT TT.BUYER_ID , COUNT(*) FROM
(SELECT testingtable1.buyer_id, testingtable1.item_id, testingtable1.created_time from (select user_id, prod_and_ts.product_id as product_id, prod_and_ts.timestamps as timestamps from testingtable2 LATERAL VIEW explode(purchased_item) exploded_table as prod_and_ts where to_date(from_unixtime(cast(prod_and_ts.timestamps as BIGINT))) = '2012-07-09') prod_and_ts RIGHT OUTER JOIN (SELECT buyer_id, item_id, rank(buyer_id), created_time, UNIX_TIMESTAMP(created_time)
FROM (
    SELECT buyer_id, item_id, created_time
    FROM testingtable1
    where to_date(from_unixtime(cast(UNIX_TIMESTAMP(created_time) as int))) = '2012-07-09'
    DISTRIBUTE BY buyer_id
    SORT BY buyer_id, created_time desc
) a
WHERE rank(buyer_id) < 5) testingtable1 ON (testingtable1.item_id = prod_and_ts.product_id AND testingtable1.BUYER_ID = prod_and_ts.USER_ID 
AND abs(datediff(testingtable1.created_time,FROM_UNIXTIME(cast(prod_and_ts.timestamps as BIGINT)))) <= 15) where prod_and_ts.product_id IS NULL ORDER BY testingtable1.buyer_id, testingtable1.created_time desc) TT GROUP BY TT.BUYER_ID;

I always get the exception as-

FAILED: Error in semantic analysis: line 10:144 Both Left and Right Aliases
Encountered in Join 15

Is there anything wrong with my query? Or in Hive we cannot calculate difference between dates in minutes? Any suggestions will be appreciated.

解决方案

I think the problem is with your join. From the the Hive language manual:

Only equality joins, outer joins, and left semi joins are supported in Hive. Hive does not support join conditions that are not equality conditions as it is very difficult to express such conditions as a map/reduce job.

这篇关于Hive中的日期差异小于15分钟的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆