Hive Full Outer Join,在同一键,不同架构上具有4个表 [英] Hive Full Outer Join with 4 Tables on same Key, different schema

查看:1579
本文介绍了Hive Full Outer Join,在同一键,不同架构上具有4个表的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试对4个Hive表进行完全外部联接. JOIN键是相同的,但是4个表的架构是不同的. 我想为4个表中存在的所有ID生成所有列值. 但是id列只应包含所有值一次出现,而不是4次(每次出现在一张表中)

I am trying to do Full Outer Join on 4 Hive tables. The JOIN key is same, but the schema of the 4 tables are different. I want to generate all the column values for all the ids present in the 4 tables. But the id column should be present only once with all values included, not 4 times(each from one table)

查询1

select count(*)
from table1 f FULL OUTER JOIN table2 u on f.id=u.id
FULL OUTER JOIN table3 v on f.id=v.id
FULL OUTER JOIN table4 v_in on f.id=v_in.id;

Count = 2787037

Count=2787037

查询2

select count(*)
from table1 f FULL OUTER JOIN table2 u on f.id=u.id
FULL OUTER JOIN table3 v on f.id=v.id
FULL OUTER JOIN table4 v_in on f.id=v_in.id
group by f.id,u.id,v.id,v_in.id, f.name, f.amt, f.add, u.dt, u.ts, v.ea,v.rd, 
v_in.c1,v_in.c2,v_in.c3,v_in.c4,v_in.c5;

Count = 2787037

Count=2787037

如何从一列中的4个表中生成所有ID值以及其他列值?

How to generate all the values of id from 4 tables in one column, along with other column values?

有更好的方法吗?

推荐答案

您应该只选择所需的列.我想您要coalesce():

You should just select the columns you want. I think you want coalesce():

select coalesce(f.id, u.id, v.id, v_in.id) as id,
       f.name, f.amt, f.add, u.dt, u.ts, v.ea, v.rd, 
       v_in.c1, v_in.c2, v_in.c3, v_in.c4, v_in.c5
from . . .;

使用full outer join,您需要很多coalesce():

select . . .
from table1 f full join
     table2 u
     on f.id = u.id full join
     table3 v
     on v.id in (f.id, u.id) full join
     table4 v_in
     on v_in.id in (f.id, u.id, v.id);

这篇关于Hive Full Outer Join,在同一键,不同架构上具有4个表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆