Join Tables on Date Range in Hive
Question
I need to join tableA to tableB on employee_id, and cal_date from tableA needs to fall between date_start and date_end from tableB. I ran the query below and received the error message below. Could you please help me correct the query? Thank you for your help!
Both left and right aliases encountered in JOIN 'date_start'
select a.*, b.skill_group
from tableA a
left join tableB b
on a.employee_id= b.employee_id
and a.cal_date >= b.date_start
and a.cal_date <= b.date_end
Recommended answer
RTFM - quoting LanguageManual Joins:
Hive does not support join conditions that are not equality conditions as it is very difficult to express such conditions as a map/reduce job.
You may try to move the BETWEEN filter to a WHERE clause, resulting in a lousy, partially cartesian join followed by a post-processing cleanup. Yuck. Depending on the actual cardinality of your "skill group" table, it may run fast, or it may take whole days.
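As a rough sketch of that workaround, the query below keeps only the equality condition in ON and applies the date range in WHERE. It reuses the table and column names from the question; the aliases a2 and m, and the assumption that (employee_id, cal_date) identifies a single row in tableA, are illustrative and not from the original post. Putting the range filter directly in the outer WHERE would silently turn the LEFT JOIN into an inner join, so the range match is done in a subquery and then left-joined back on equality keys only.

```sql
-- Sketch only: assumes (employee_id, cal_date) is unique in tableA.
SELECT a.*, m.skill_group
FROM tableA a
LEFT JOIN (
    -- Equi-join on employee_id, then filter the date range in WHERE.
    -- This is the "partially cartesian" step: every tableA row is paired
    -- with every tableB row for the same employee before filtering.
    SELECT DISTINCT a2.employee_id, a2.cal_date, b.skill_group
    FROM tableA a2
    JOIN tableB b
      ON a2.employee_id = b.employee_id
    WHERE a2.cal_date >= b.date_start
      AND a2.cal_date <= b.date_end
) m
  -- Only equality conditions here, so older Hive versions accept it.
  ON a.employee_id = m.employee_id
 AND a.cal_date = m.cal_date;
```

If dropping tableA rows that have no matching date range is acceptable, the inner subquery on its own is enough. Note also that, as far as I know, the same LanguageManual page states that complex expressions in ON are supported from Hive 2.2.0 onward, so on a recent version the original query may run unchanged; check your Hive version before rewriting.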