SQL Query JOIN with Table
Question
SELECT user_id,
       prod_and_ts.product_id AS product_id,
       prod_and_ts.timestamps AS timestamps
FROM testingtable2
LATERAL VIEW explode(purchased_item) exploded_table AS prod_and_ts;
Using the above query, I get the output below.
USER_ID    | PRODUCT_ID     | TIMESTAMPS
-----------+----------------+-----------
1015826235 | 220003038067   | 1004841621
1015826235 | 300003861266   | 1005268799
1015826235 | 140002997245   | 1061569397
1015826235 | *200002448035* | 1005542471
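For readers unfamiliar with `LATERAL VIEW explode`, the flattening step can be sketched in plain Python. This is only an illustration: the in-memory `testingtable2` list and the `explode_purchases` helper are hypothetical, and the sketch assumes `purchased_item` is an array of structs with `product_id` and `timestamps` fields, which the question implies but does not state.

```python
# Hypothetical in-memory stand-in for testingtable2. The real column type is
# assumed to be array<struct<product_id, timestamps>>.
testingtable2 = [
    {
        "user_id": 1015826235,
        "purchased_item": [
            {"product_id": 220003038067, "timestamps": 1004841621},
            {"product_id": 300003861266, "timestamps": 1005268799},
            {"product_id": 140002997245, "timestamps": 1061569397},
            {"product_id": 200002448035, "timestamps": 1005542471},
        ],
    },
]


def explode_purchases(rows):
    """Yield one (user_id, product_id, timestamps) tuple per array element."""
    for row in rows:
        for item in row["purchased_item"]:
            yield (row["user_id"], item["product_id"], item["timestamps"])


exploded = list(explode_purchases(testingtable2))
for record in exploded:
    print(record)
```

Each array element becomes its own row, with `user_id` repeated, which is the shape of the output shown above.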
If you compare the query output above with the Table2 data below, the PRODUCT_ID in the last row of the output does not match the ITEM_ID in the last row of Table2.
BUYER_ID   | ITEM_ID        | CREATED_TIME
-----------+----------------+--------------------
1015826235 | 220003038067   | 2001-11-03 19:40:21
1015826235 | 300003861266   | 2001-11-08 18:19:59
1015826235 | 140002997245   | 2003-08-22 09:23:17
1015826235 | *210002448035* | 2001-11-11 22:21:11
So my question is: how do I find all the PRODUCT_ID (ITEM_ID) and TIMESTAMPS (CREATED_TIME) values that do not match the Table2 data for a particular BUYER_ID or USER_ID?
So for the example above, I need to show a result like this:
BUYER_ID   | ITEM_ID        | CREATED_TIME        | USER_ID    | PRODUCT_ID     | TIMESTAMPS
-----------+----------------+---------------------+------------+----------------+-----------
1015826235 | *210002448035* | 2001-11-11 22:21:11 | 1015826235 | *200002448035* | 1005542471
I need to JOIN the query I wrote above with table2 to get this result, so my query has to be used as part of the join. That is what confuses me. Any suggestions would be appreciated.
UPDATE:
I wrote the query below, but it does not produce the output I want. Can anyone help me with this?
SELECT table2.buyer_id, table2.item_id, table2.created_time
FROM (
    SELECT user_id,
           prod_and_ts.product_id AS product_id,
           prod_and_ts.timestamps AS timestamps
    FROM testingtable2
    LATERAL VIEW explode(purchased_item) exploded_table AS prod_and_ts
) prod_and_ts
JOIN table2
WHERE prod_and_ts.user_id = table2.buyer_id
  AND (product_id <> table2.item_id
       OR timestamps <> UNIX_TIMESTAMP(table2.created_time));
I think you can do what you want with two queries, but I'm not 100% sure. Often in this situation, it is sufficient to find things in the first table that don't match in the second table. You are also trying to get a "closest" match, which is why this is challenging.
The following query looks for matches on user id and exactly one of the other two fields, and then combines them:
SELECT table2.buyer_id, table2.item_id, table2.created_time, prod_and_ts.*
FROM (
    SELECT user_id,
           prod_and_ts.product_id AS product_id,
           prod_and_ts.timestamps AS timestamps
    FROM testingtable2
    LATERAL VIEW explode(purchased_item) exploded_table AS prod_and_ts
) prod_and_ts
JOIN table2
  ON prod_and_ts.user_id = table2.buyer_id
 AND prod_and_ts.product_id = table2.item_id
 AND prod_and_ts.timestamps <> UNIX_TIMESTAMP(table2.created_time)
UNION ALL
SELECT table2.buyer_id, table2.item_id, table2.created_time, prod_and_ts.*
FROM (
    SELECT user_id,
           prod_and_ts.product_id AS product_id,
           prod_and_ts.timestamps AS timestamps
    FROM testingtable2
    LATERAL VIEW explode(purchased_item) exploded_table AS prod_and_ts
) prod_and_ts
JOIN table2
  ON prod_and_ts.user_id = table2.buyer_id
 AND prod_and_ts.product_id <> table2.item_id
 AND prod_and_ts.timestamps = UNIX_TIMESTAMP(table2.created_time)
This will not find situations where there is no match on either field.
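To see how the two join conditions behave on the sample data, here is a rough local sketch using Python's built-in `sqlite3` module rather than Hive. Several things are assumptions made for illustration: SQLite has no `LATERAL VIEW explode`, so the exploded rows are loaded directly into a stand-in `prod_and_ts` table; and Hive's `UNIX_TIMESTAMP` is approximated with `strftime('%s', ..., '+7 hours')`, because the sample values are consistent with a fixed UTC-7 offset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the exploded subquery (SQLite has no LATERAL VIEW explode).
    CREATE TABLE prod_and_ts (user_id INTEGER, product_id INTEGER, timestamps INTEGER);
    CREATE TABLE table2 (buyer_id INTEGER, item_id INTEGER, created_time TEXT);
    INSERT INTO prod_and_ts VALUES
        (1015826235, 220003038067, 1004841621),
        (1015826235, 300003861266, 1005268799),
        (1015826235, 140002997245, 1061569397),
        (1015826235, 200002448035, 1005542471);
    INSERT INTO table2 VALUES
        (1015826235, 220003038067, '2001-11-03 19:40:21'),
        (1015826235, 300003861266, '2001-11-08 18:19:59'),
        (1015826235, 140002997245, '2003-08-22 09:23:17'),
        (1015826235, 210002448035, '2001-11-11 22:21:11');
""")

# Assumed equivalent of Hive's UNIX_TIMESTAMP(created_time) for a UTC-7 session.
UNIX_TS = "CAST(strftime('%s', t.created_time, '+7 hours') AS INTEGER)"

# Shared SELECT list and user-id join; each query appends its own extra conditions.
SELECT_JOIN = """
    SELECT t.buyer_id, t.item_id, t.created_time,
           p.user_id, p.product_id, p.timestamps
    FROM prod_and_ts p JOIN table2 t ON p.user_id = t.buyer_id
"""

# The condition from the question: any pair that differs on either field.
or_query = f"{SELECT_JOIN} AND (p.product_id <> t.item_id OR p.timestamps <> {UNIX_TS})"

# The condition from this answer: exactly one of the two fields matches.
union_query = f"""
    {SELECT_JOIN} AND p.product_id = t.item_id AND p.timestamps <> {UNIX_TS}
    UNION ALL
    {SELECT_JOIN} AND p.product_id <> t.item_id AND p.timestamps = {UNIX_TS}
"""

or_rows = conn.execute(or_query).fetchall()
union_rows = conn.execute(union_query).fetchall()
print(len(or_rows))   # 13: every purchase is also paired with unrelated table2 rows
print(union_rows)     # the single row where only the product id differs
```

On this data the OR condition pairs each purchase with every other table2 row for the same user, which is why the query in the question returns far more rows than wanted, while the union returns only the row shown in the desired result.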
Also, I've written this using the ON syntax rather than WHERE. I assume Hive supports this.
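The simpler approach mentioned at the start of this answer, finding rows in the first table that have no exact match in the second, can be sketched as an anti-join. As before, this runs on SQLite through Python's `sqlite3` as a stand-in for Hive; the `prod_and_ts` table replaces the exploded subquery, and the `'+7 hours'` conversion is an assumption inferred from the sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-in for the exploded subquery (SQLite has no LATERAL VIEW explode).
    CREATE TABLE prod_and_ts (user_id INTEGER, product_id INTEGER, timestamps INTEGER);
    CREATE TABLE table2 (buyer_id INTEGER, item_id INTEGER, created_time TEXT);
    INSERT INTO prod_and_ts VALUES
        (1015826235, 220003038067, 1004841621),
        (1015826235, 300003861266, 1005268799),
        (1015826235, 140002997245, 1061569397),
        (1015826235, 200002448035, 1005542471);
    INSERT INTO table2 VALUES
        (1015826235, 220003038067, '2001-11-03 19:40:21'),
        (1015826235, 300003861266, '2001-11-08 18:19:59'),
        (1015826235, 140002997245, '2003-08-22 09:23:17'),
        (1015826235, 210002448035, '2001-11-11 22:21:11');
""")

# LEFT JOIN on all three fields, then keep only purchases with no partner row.
unmatched = conn.execute("""
    SELECT p.user_id, p.product_id, p.timestamps
    FROM prod_and_ts p
    LEFT JOIN table2 t
      ON p.user_id = t.buyer_id
     AND p.product_id = t.item_id
     AND p.timestamps = CAST(strftime('%s', t.created_time, '+7 hours') AS INTEGER)
    WHERE t.buyer_id IS NULL
""").fetchall()
print(unmatched)
```

This form uses only equality conditions in the ON clause, which may matter on Hive versions that reject non-equality join conditions. It reports the unmatched purchase but not the table2 row it is closest to, so it does not by itself produce the side-by-side result the question asks for.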