Spark Scala equivalent for SKEW join hints
Question
Spark SQL has a skew hint available (see here). Is there an equivalent hint available for Spark Scala?
Example: This is the Spark SQL code where the fact table has a skewed ProductId column:
SELECT /*+ SKEW('viewFact', 'ProductId') */
RevSumDivisionName, RevSumCategoryName, CloudAddOnFlag,
SUM(ActualRevenueAmt) AS RevenueUSD, COUNT(*) AS Cnt
FROM viewFact
INNER JOIN viewPMST ON viewFact.ProductId = viewPMST.ProductId
INNER JOIN viewRsDf ON viewPMST.ProductFamilyId = viewRsDf.ProductFamilyId
INNER JOIN viewRevH ON viewRsDf.RevSumCategoryId = viewRevH.RevSumCategoryId
GROUP BY RevSumDivisionName, RevSumCategoryName, CloudAddOnFlag
The same join in Scala:
inFact
.join(inPMst, Seq("ProductId"))
.join(inRsDf, Seq("ProductFamilyId"))
.join(inRevH, Seq("RevSumCategoryId"))
.groupBy($"RevSumDivisionName", $"RevSumCategoryName", $"CloudAddOnFlag")
.agg(sum($"ActualRevenueAmt") as "RevenueUSD", count($"*") as "Cnt")
I'm just unable to find the syntax for the skew hint.
Answer
"Spark SQL has a skew hint available"

It does not. The Databricks platform does, but it is a proprietary extension (as is indexing) that is not available in open-source Spark.
"I'm just unable to find the syntax for the skew hint."
In the general case, query plan hints are passed using the hint method, which can be used like this:
val hint: String = ???
inFact.join(inPMst.hint(hint), Seq("ProductId"))
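For a concrete case, open-source Spark does recognize a handful of built-in hint names through this same method, e.g. "broadcast" (and, from Spark 3.0, join-strategy hints such as "merge" and "shuffle_hash"). A minimal sketch, assuming the DataFrames from the question and an active SparkSession:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()
import spark.implicits._

// Hint Spark to broadcast the smaller dimension table instead of
// shuffling it; this sidesteps skew when the dimension side fits in memory.
val result = inFact
  .join(inPMst.hint("broadcast"), Seq("ProductId"))
```

Note that in Spark 3.x, Adaptive Query Execution can also split skewed join partitions automatically when `spark.sql.adaptive.enabled` and `spark.sql.adaptive.skewJoin.enabled` are set to true.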
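Since the skew hint itself is unavailable outside Databricks, a common open-source workaround is to "salt" the skewed key so the hot ProductId values are spread across several partitions. This is a sketch, not the original answer's method; `numSalts` and the column name "salt" are assumptions:

```scala
import org.apache.spark.sql.functions._

val numSalts = 16

// Append a random salt (0 until numSalts) to every fact row,
// splitting each hot ProductId across numSalts partitions.
val saltedFact = inFact
  .withColumn("salt", (rand() * numSalts).cast("int"))

// Replicate every dimension row once per salt value so each
// salted fact row still finds its match.
val saltedPMst = inPMst
  .withColumn("salt", explode(array((0 until numSalts).map(lit): _*)))

// Join on the composite key, then drop the helper column.
val joined = saltedFact
  .join(saltedPMst, Seq("ProductId", "salt"))
  .drop("salt")
```

The trade-off is that the dimension table is duplicated numSalts times, so this pays off only when the join is genuinely skew-bound.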