Spark SQL broadcast hint intermediate tables


Question

I have a problem using broadcast hints (possibly due to a gap in my SQL knowledge).

I have a query like

SELECT * /* broadcast(a) */
FROM a 
INNER JOIN b
ON ....
INNER JOIN c
on ....

I would like to do

SELECT * /* broadcast(a) */
FROM a 
INNER JOIN b 
ON ....
INNER JOIN c /* broadcast(AjoinedwithB) */
on ....

I mean, I want to force a broadcast join (I would rather not change Spark parameters to force it everywhere), but I don't know how to refer to the intermediate table, called AjoinedwithB above.

Of course I can split the SQL, work with the DataFrame API and so on, but I would like to do it in a single SQL query.

Solution

You can use either a subquery:

SELECT /*+ broadcast(a_b) */ *
FROM 
    (SELECT /*+ broadcast(a) */ * FROM a JOIN b ON ...) AS a_b 
    JOIN c ON ...

or a CTE:

WITH a_b AS (SELECT /*+ broadcast(a) */ * FROM a JOIN b ON ...)
SELECT /*+ broadcast(a_b) */ * FROM a_b JOIN c ON ...
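
To check whether both hints took effect, one option is to prefix the query with EXPLAIN and read the join strategy from the physical plan. The sketch below is only illustrative: the join conditions in the question are elided, so the columns used here (a.id, b.a_id, b.b_val, c.a_id) are hypothetical. The CTE selects explicit columns so that a_b does not end up with duplicate column names.

```sql
-- Hypothetical schema (not from the question): a(id), b(a_id, b_val), c(a_id)
EXPLAIN
WITH a_b AS (
    -- Inner hint: ask Spark to broadcast a when joining it with b
    SELECT /*+ BROADCAST(a) */ a.id, b.b_val
    FROM a JOIN b ON a.id = b.a_id
)
-- Outer hint: refers to the intermediate result by its CTE name
SELECT /*+ BROADCAST(a_b) */ *
FROM a_b JOIN c ON a_b.id = c.a_id;
```

If the hints are honored, both joins would be expected to show up as BroadcastHashJoin in the plan. Note that a hint must come directly after SELECT and start with /*+; without the plus sign it is parsed as an ordinary comment and ignored.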
