Spark Scala equivalent for SKEW join hints


Question

Spark SQL has a skew hint available (please see here). Is there an equivalent hint available for Spark Scala?

Example: this is the Spark SQL code, where the fact table has a skewed ProductId column:

SELECT /*+ SKEW('viewFact', 'ProductId') */
    RevSumDivisionName, RevSumCategoryName, CloudAddOnFlag,
    SUM(ActualRevenueAmt) AS RevenueUSD, COUNT(*) AS Cnt
FROM viewFact
INNER JOIN viewPMST ON viewFact.ProductId = viewPMST.ProductId
INNER JOIN viewRsDf ON viewPMST.ProductFamilyId = viewRsDf.ProductFamilyId
INNER JOIN viewRevH ON viewRsDf.RevSumCategoryId = viewRevH.RevSumCategoryId
GROUP BY RevSumDivisionName, RevSumCategoryName, CloudAddOnFlag

The same join in Scala:

inFact
  .join(inPMst, Seq("ProductId"))
  .join(inRsDf, Seq("ProductFamilyId"))
  .join(inRevH, Seq("RevSumCategoryId"))
  .groupBy($"RevSumDivisionName", $"RevSumCategoryName", $"CloudAddOnFlag")
  .agg(sum($"ActualRevenueAmt") as "RevenueUSD", count($"*") as "Cnt")

I just can't find the syntax for the skew hint.

Recommended answer

"Spark SQL has a skew hint available"

It does not. The Databricks platform has one, but it is a proprietary extension (as is indexing) and is not available in Spark itself.

"I just can't find the syntax for the skew hint."

In the general case, query plan hints are passed using the hint method, which can be used like this:

val hint: String = ???  // placeholder: the name of the hint to pass
inFact.join(inPMst.hint(hint), Seq("ProductId"))
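
To make the hint call concrete, here is a minimal sketch that passes "broadcast", a hint name that open-source Spark recognizes, through Dataset.hint. The two small DataFrames are hypothetical stand-ins for the question's inFact and inPMst, and the local SparkSession is only there so the snippet runs on its own.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{count, lit, sum}

object HintSketch {
  def main(args: Array[String]): Unit = {
    // Local session, only so the sketch is self-contained.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("hint-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-ins for the question's inFact and inPMst.
    val inFact = Seq((1, 10.0), (1, 20.0), (2, 5.0))
      .toDF("ProductId", "ActualRevenueAmt")
    val inPMst = Seq((1, 100), (2, 200))
      .toDF("ProductId", "ProductFamilyId")

    // Attach the hint to the side of the join it applies to.
    // "broadcast" is used here because stock Spark understands it;
    // it is not a replacement for the Databricks skew hint.
    val joined = inFact.join(inPMst.hint("broadcast"), Seq("ProductId"))

    // Print the physical plan so the effect of the hint can be inspected.
    joined.explain()

    joined
      .groupBy($"ProductFamilyId")
      .agg(sum($"ActualRevenueAmt") as "RevenueUSD", count(lit(1)) as "Cnt")
      .show()

    spark.stop()
  }
}
```

The Databricks documentation describes passing its skew hint through the same method, along the lines of inFact.hint("skew", "ProductId"); treat that exact form as an assumption to check against their docs. Stock Spark does not recognize the hint name "skew", so outside Databricks the call is expected to have no effect on the plan.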
