SQL Query JOIN 与表 [英] SQL Query JOIN with Table
问题描述
select user_id, prod_and_ts.product_id as product_id, prod_and_ts.timestamps as
timestamps from testingtable2 LATERAL VIEW explode(purchased_item) exploded_table
as prod_and_ts;
通过使用上述查询,我得到以下输出.
By using the above query, I am getting the below output.
USER_ID | PRODUCT_ID | TIMESTAMPS
------------+------------------+-------------
1015826235 220003038067 1004841621
1015826235 300003861266 1005268799
1015826235 140002997245 1061569397
1015826235 *200002448035* 1005542471
如果将上面的查询输出与下面的Table2数据
进行比较,则上面输出的最后一行
中的product_id
是与下面 Table2
数据中最后一行的 ITEM_ID
不匹配.
If you compare the above output from the query with the below Table2 data
, then the product_id
in the last line of above output
is not matching with the ITEM_ID
in the last line in the below Table2
data.
BUYER_ID | ITEM_ID | CREATED_TIME
-------------+-------------------+------------------------
1015826235 220003038067 2001-11-03 19:40:21
1015826235 300003861266 2001-11-08 18:19:59
1015826235 140002997245 2003-08-22 09:23:17
1015826235 *210002448035* 2001-11-11 22:21:11
所以我的问题是
查找所有与特定 BUYER_ID 或 USER_ID 对应的 Table2
数据不匹配的 PRODUCT_ID(ITEM_ID)
和 TIMESTAMPS(CREATED_TIME)
.
Find all those PRODUCT_ID(ITEM_ID)
and TIMESTAMPS(CREATED_TIME)
that are not matching with Table2
data corresponding to particular BUYER_ID or USER_ID.
所以我需要为上面的例子显示这样的结果-
So I need to show the result like this for the above example-
BUYER_ID | ITEM_ID | CREATED_TIME | USER_ID | PRODUCT_ID | TIMESTAMPS
-----------+-------------------+-------------------------+---------------+------------------+------------------
1015826235 *210002448035* 2001-11-11 22:21:11 1015826235 *200002448035* 1005542471
我需要JOIN上面用table2写的查询才能得到上面的结果.所以我需要在 JOINING 过程中使用我上面的查询.这让我很困惑.任何建议将不胜感激.
I need to JOIN the above query that I wrote with table2 to get the above result. So I need to use my above query in the JOINING process. That is confusing me a lot. Any suggestion will be appreciated.
更新:-
我编写了以下查询,但不知何故我无法实现我想要实现的输出.有人能帮我解决这个问题吗?
I wrote the below query, but somehow I am not able to achieve the output that I wanted to achieve. Can anyone help me with this?
SELECT table2.buyer_id, table2.item_id, table2.created_time from
(select user_id, prod_and_ts.product_id as product_id, prod_and_ts.timestamps as
timestamps from testingtable2 LATERAL VIEW explode(purchased_item) exploded_table
as prod_and_ts) prod_and_ts JOIN table2 where
prod_and_ts.user_id = table2.buyer_id
and (product_id <> table2.item_id or
timestamps <> UNIX_TIMESTAMP(table2.created_time));
推荐答案
我认为你可以用两个查询做你想做的事,但我不是 100% 确定.通常在这种情况下,在第一个表中找到与第二个表中不匹配的内容就足够了.您还试图获得最接近"的匹配,这就是为什么这具有挑战性.
I think you can do what you want with two queries, but I'm not 100% sure. Often in this situation, it is sufficient to find things in the first table that don't match in the second table. You are also trying to get a "closest" match, which is why this is challenging.
以下查询查找用户 ID 和其他两个字段之一的匹配项,然后将它们组合起来:
The following query looks for matches on user id and exactly one of the other two fields, and then combines them:
SELECT table2.buyer_id, table2.item_id, table2.created_time, prod_and_ts.*
from (select user_id, prod_and_ts.product_id as product_id, prod_and_ts.timestamps as timestamps
from testingtable2 LATERAL VIEW
explode(purchased_item) exploded_table as prod_and_ts
) prod_and_ts JOIN
table2
on prod_and_ts.user_id = table2.buyer_id and
prod_and_ts.product_id = table2.item_id and
prod_and_ts.timestamps <> UNIX_TIMESTAMP(table2.created_time)
union all
SELECT table2.buyer_id, table2.item_id, table2.created_time, prod_and_ts.*
from (select user_id, prod_and_ts.product_id as product_id, prod_and_ts.timestamps as timestamps
from testingtable2 LATERAL VIEW
explode(purchased_item) exploded_table as prod_and_ts
) prod_and_ts JOIN
table2
on prod_and_ts.user_id = table2.buyer_id and
prod_and_ts.product_id <> table2.item_id and
prod_and_ts.timestamps = UNIX_TIMESTAMP(table2.created_time)
这不会找到任一字段都没有匹配项的情况.
This will not find situations where there is no match on either field.
此外,我使用on"语法而不是where"来编写此代码.我假设 HIVE 支持这一点.
Also, I've written this using the "on" syntax rather than "where". I assume HIVE supports this.
这篇关于SQL Query JOIN 与表的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!