Hive - Using NOT EXISTS with a Semi Join


Problem description





I need to use a NOT IN query in Hive.

I have three tables: A, B and C.

B has the fields PRODUCT, ID and VALUE. C has the fields ID and VALUE.

I need to write the rows from table B that have no matching ID and VALUE fields in table C to table A.

INSERT OVERWRITE TABLE A a
SELECT * FROM B b
LEFT SEMI JOIN C c
ON (b.ID = c.ID AND b.VALUE = c.VALUE)
WHERE b.ID = NULL AND b.VALUE = NULL;

This suggestion from http://stackoverflow.com/questions/25041026/hive-left-semi-join-for-not-exists does not work, because it refers to the right-side table in the WHERE clause, which should not be done.

How can I form the equivalent query without referring to the right-side table in the WHERE clause?

Is there any other solution?

Solution

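First, check that the columns of the target table A match the columns returned by SELECT *, since the query uses *. With LEFT SEMI JOIN, only columns of the left table B can be selected.

Then, the NULL test should be written as b.VALUE IS NULL, not b.VALUE = NULL.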
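
To illustrate the point about NULL, here is a minimal sketch against the table B from the question. Under SQL three-valued logic, a comparison with = NULL evaluates to NULL rather than TRUE, so a WHERE clause built on it filters out every row; IS NULL is the actual null test.

```sql
-- Returns no rows: b.ID = NULL is never TRUE, even when b.ID is NULL.
SELECT * FROM B b WHERE b.ID = NULL;

-- Returns the rows of B whose ID is NULL.
SELECT * FROM B b WHERE b.ID IS NULL;
```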

The query should be like this:

INSERT OVERWRITE TABLE A a
SELECT * FROM B b
LEFT SEMI JOIN C c
ON (b.ID = c.ID AND b.VALUE = c.VALUE)
WHERE b.ID IS NULL AND b.VALUE IS NULL;
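
One caveat: LEFT SEMI JOIN keeps only the rows of B that do have a match in C, and Hive allows the right-side table to be referenced only in the ON clause, so a semi join corresponds to IN / EXISTS rather than NOT EXISTS. A common way to get the non-matching rows is an anti-join. The sketch below shows two alternatives; it assumes that A has the same columns as B (PRODUCT, ID, VALUE), which the question does not state, and the second form assumes a Hive version that supports subqueries in the WHERE clause (0.13 or later, as far as I know).

```sql
-- Alternative 1: LEFT OUTER JOIN anti-join.
-- Unmatched rows of B get NULLs for all columns of C, so c.ID IS NULL
-- selects exactly the rows of B with no (ID, VALUE) match in C.
-- Referencing c in WHERE is permitted here because this is an outer join,
-- not a semi join.
INSERT OVERWRITE TABLE A
SELECT b.PRODUCT, b.ID, b.VALUE
FROM B b
LEFT OUTER JOIN C c
  ON (b.ID = c.ID AND b.VALUE = c.VALUE)
WHERE c.ID IS NULL;

-- Alternative 2: correlated NOT EXISTS subquery.
-- The outer WHERE clause does not mention C's columns directly.
INSERT OVERWRITE TABLE A
SELECT b.PRODUCT, b.ID, b.VALUE
FROM B b
WHERE NOT EXISTS (
  SELECT c.ID
  FROM C c
  WHERE c.ID = b.ID AND c.VALUE = b.VALUE
);
```

In both forms, rows of B whose ID or VALUE is NULL never satisfy the equality condition and are therefore written to A as non-matching.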
