Spark SQL broadcast hint intermediate tables
Question
I have a problem using broadcast hints (perhaps due to my limited SQL knowledge).
I have a query like this:
SELECT * /* broadcast(a) */
FROM a
INNER JOIN b
ON ....
INNER JOIN c
ON ....
I would like to do something like this:
SELECT * /* broadcast(a) */
FROM a
INNER JOIN b
ON ....
INNER JOIN c /* broadcast(AjoinedwithB) */
ON ....
That is, I want to force a broadcast join (I would prefer not to change Spark parameters to force it everywhere), but I don't know how to refer to the intermediate table AjoinedwithB, i.e. the result of joining a with b.
Of course I could split the SQL, work with the DataFrame API and so on, but I would like to do it in a single SQL query.
Solution
You can use either a subquery:
SELECT /*+ broadcast(a_b) */ *
FROM
(SELECT /*+ broadcast(a) */ * FROM a JOIN b ON ...) AS a_b
JOIN c ON ...
or a CTE:
WITH a_b AS (SELECT /*+ broadcast(a) */ * FROM a JOIN b ON ...)
SELECT /*+ broadcast(a_b) */ * FROM a_b JOIN c ON ...
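To check that both hints are picked up, one option is to run the CTE form under EXPLAIN and look at the physical plan. The following is only a sketch under stated assumptions: the three temporary views, their column names (id, a_val, b_val, c_val) and the join keys are hypothetical and exist only to make the example self-contained. Note that Spark only treats a comment as a hint when it is written as /*+ ... */ directly after SELECT; a plain /* ... */ comment, as in the question, is ignored. Explicit column lists are used instead of SELECT * so that the intermediate result a_b does not end up with two columns named id.

```sql
-- Hypothetical tiny tables, defined only so the sketch can run on its own.
CREATE OR REPLACE TEMPORARY VIEW a AS
  SELECT * FROM VALUES (1, 'x'), (2, 'y') AS t(id, a_val);
CREATE OR REPLACE TEMPORARY VIEW b AS
  SELECT * FROM VALUES (1, 10), (2, 20) AS t(id, b_val);
CREATE OR REPLACE TEMPORARY VIEW c AS
  SELECT * FROM VALUES (1, true), (2, false) AS t(id, c_val);

-- For this experiment only: turn off size-based automatic broadcasting in the
-- current session, so that any broadcast join in the plan comes from the hints.
-- This is not needed in the real query.
SET spark.sql.autoBroadcastJoinThreshold = -1;

-- The inner hint names table a; the outer hint names the CTE alias a_b.
EXPLAIN
WITH a_b AS (
  SELECT /*+ BROADCAST(a) */ a.id, a.a_val, b.b_val
  FROM a JOIN b ON a.id = b.id
)
SELECT /*+ BROADCAST(a_b) */ a_b.id, a_b.a_val, a_b.b_val, c.c_val
FROM a_b JOIN c ON a_b.id = c.id;
```

If the hints are applied, the printed plan should contain broadcast joins for both the a/b join and the a_b/c join; the exact plan text depends on the Spark version, so treat this as a way to inspect the behavior rather than a guaranteed output.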