How to enable Tungsten optimization in Spark 2?


Question

I just built Spark 2 with Hive support and deployed it to a cluster running Hortonworks 2.3.4. However, I find that this Spark 2.0.3 is slower than the standard Spark 1.5.3 that comes with HDP 2.3.

When I check the explain output, it seems that my Spark 2.0.3 is not using Tungsten. Do I need to create a special build to enable Tungsten?

Spark 1.5.3 explain

== Physical Plan ==
TungstenAggregate(key=[id#2], functions=[], output=[id#2])
TungstenExchange hashpartitioning(id#2)
TungstenAggregate(key=[id#2], functions=[], output=[id#2])
HiveTableScan [id#2], (MetastoreRelation default, testing, None)

Spark 2.0.3 explain

== Physical Plan ==
*HashAggregate(keys=[id#2481], functions=[])
+- Exchange hashpartitioning(id#2481, 72)
   +- *HashAggregate(keys=[id#2481], functions=[])
      +- HiveTableScan [id#2481], MetastoreRelation default, testing

Recommended answer

It still uses Tungsten; the class was renamed (TungstenAggregate in the 1.5.3 plan appears as HashAggregate in the 2.0.3 plan): https://github.com/apache/spark/commit/8900c8d8ff1614b5ec5a2ce213832fa13462b4d4
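One way to read the 2.0.3 plan: in Spark 2.0 explain output, a leading `*` on an operator name marks it as covered by whole-stage code generation, which is part of the Tungsten work. Below is a minimal sketch in plain Python (no Spark required) that scans a plan string for that marker. The helper name `codegen_operators` and the regular expression are my own illustration, not part of any Spark API, and the plan text is adapted from the question.

```python
import re

# Physical plan adapted from the Spark 2.0.3 explain output in the question.
PLAN = """\
== Physical Plan ==
*HashAggregate(keys=[id#2481], functions=[])
+- Exchange hashpartitioning(id#2481, 72)
   +- *HashAggregate(keys=[id#2481], functions=[])
      +- HiveTableScan [id#2481], MetastoreRelation default, testing
"""


def codegen_operators(plan_text):
    """Hypothetical helper: list (operator_name, has_star) per plan line.

    has_star is True when the operator name is prefixed with '*', the
    marker Spark 2.0 prints for whole-stage code generation.
    """
    ops = []
    for line in plan_text.splitlines():
        # Skip tree-drawing prefixes such as '   +- ' before the operator.
        m = re.match(r"^[\s:|]*(?:\+-\s*)?(\*?)(\w+)", line)
        if m:
            ops.append((m.group(2), m.group(1) == "*"))
    return ops


result = codegen_operators(PLAN)
for name, starred in result:
    print(f"{name}: {'codegen' if starred else 'no codegen'}")
```

Here both HashAggregate operators carry the marker, while Exchange and HiveTableScan do not. Whole-stage code generation is governed by the `spark.sql.codegen.wholeStage` setting, which is on by default, so no special build should be needed.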
