Join Tables on Date Range in Hive

Question

I need to join tableA to tableB on employee_id, and the cal_date from tableA needs to fall between date_start and date_end from tableB. I ran the query below and received the error message below. Could you please help me correct the query? Thank you for your help!

Both left and right aliases encountered in JOIN 'date_start'

select a.*, b.skill_group 
from tableA a 
  left join tableB b 
    on a.employee_id= b.employee_id 
    and a.cal_date >= b.date_start 
    and a.cal_date <= b.date_end


Recommended answer

RTFM - quoting LanguageManual Joins


Hive does not support join conditions that are not equality conditions as it is very difficult to express such conditions as a map/reduce job.

You may try to move the BETWEEN filter to a WHERE clause, resulting in a lousy partially-cartesian-join followed by a post-processing cleanup. Yuck. Depending on the actual cardinality of your "skill group" table, it may work fast - or take whole days.
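For illustration, here is a rough sketch of that workaround using the table and column names from the question. It is untested against any particular Hive version, so treat it as a starting point rather than a verified fix. Only the equality condition stays in ON; the date range moves to WHERE. One caveat: filtering on b's columns in WHERE drops every row of tableA that has no matching range, so the first query behaves like an inner join. The second query is one possible way to get the unmatched rows back, assuming (employee_id, cal_date) is enough to match a tableA row to its skill_group; the aliases a2 and m are made up for this example.

```sql
-- Variant 1: equi-join in ON, date range in WHERE.
-- Rows of tableA without a matching range in tableB are dropped.
SELECT a.*, b.skill_group
FROM tableA a
JOIN tableB b
  ON a.employee_id = b.employee_id
WHERE a.cal_date >= b.date_start
  AND a.cal_date <= b.date_end;

-- Variant 2: keep unmatched tableA rows (skill_group is NULL for them).
-- The subquery m resolves the range match first; the outer LEFT JOIN
-- then uses equality conditions only.
SELECT a.*, m.skill_group
FROM tableA a
LEFT JOIN (
  SELECT DISTINCT a2.employee_id, a2.cal_date, b.skill_group
  FROM tableA a2
  JOIN tableB b
    ON a2.employee_id = b.employee_id
  WHERE a2.cal_date >= b.date_start
    AND a2.cal_date <= b.date_end
) m
  ON a.employee_id = m.employee_id
 AND a.cal_date = m.cal_date;
```

Both variants still pair every tableA row with every tableB row for the same employee_id before filtering, which is the partially cartesian cost mentioned above. As far as I recall, the same LanguageManual page also notes that Hive 2.2.0 and later accept complex expressions in the ON clause, so on a recent version the original query may simply run as written; check that against your installation.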
