Customize SparkContext using sparkConf.set(..) when using spark-shell
Question
In Spark, there are three primary ways to specify the options for the SparkConf used to create the SparkContext:
- As properties in conf/spark-defaults.conf, e.g., the line:
  spark.driver.memory 4g
- As arguments passed to spark-shell (or spark-submit), e.g.:
  spark-shell --driver-memory 4g ...
- By calling set(..) on the SparkConf instance in code, e.g.:
  sparkConf.set("spark.driver.memory", "4g")
However, when using spark-shell, the SparkContext is already created for you by the time you get a shell prompt, in the variable named sc. When using spark-shell, how do you use option #3 in the list above to set configuration options, if the SparkContext is already created before you have a chance to execute any Scala statements?
In particular, I am trying to use Kryo serialization and GraphX. The prescribed way to use Kryo with GraphX is to execute the following Scala statement when customizing the SparkConf instance:
GraphXUtils.registerKryoClasses( sparkConf )
How do I run that when using spark-shell?
Answer
Spark 2.0+
You should be able to use the SparkSession.conf.set method to set configuration options at runtime.
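For example, in a Spark 2.x spark-shell the SparkSession is already available as the variable spark, and runtime-settable options can be changed on it directly. This is a minimal sketch; spark.sql.shuffle.partitions is only an illustrative key, and options that are fixed at JVM or context startup (such as spark.driver.memory) still cannot be changed this way:

// spark is the SparkSession created by spark-shell in Spark 2.0+
spark.conf.set("spark.sql.shuffle.partitions", "48")  // adjust a runtime option
spark.conf.get("spark.sql.shuffle.partitions")        // returns "48"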
Spark < 2.0
You can simply stop an existing context and create a new one:
import org.apache.spark.{SparkContext, SparkConf}
sc.stop()
val conf = new SparkConf().set("spark.executor.memory", "4g")
val sc = new SparkContext(conf)
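Applied to the Kryo/GraphX case from the question, the same stop-and-recreate pattern might look like the sketch below. It assumes GraphX is on the shell's classpath (as it is in a standard Spark distribution) and that the fresh SparkConf picks up the master and app settings the shell was started with:

import org.apache.spark.{SparkContext, SparkConf}
import org.apache.spark.graphx.GraphXUtils

sc.stop()                              // stop the context created by the shell
val conf = new SparkConf()             // loads the shell's spark.* properties
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
GraphXUtils.registerKryoClasses(conf)  // register GraphX classes with Kryo
val sc = new SparkContext(conf)        // new context with Kryo enabled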
As you can read in the official documentation:
once a SparkConf object is passed to Spark, it is cloned and can no longer be modified by the user. Spark does not support modifying the configuration at runtime.
So as you can see, stopping the context is the only applicable option once the shell has been started.
You can always use configuration files or the --conf argument to spark-shell to set the required parameters, which will be used by the default context. In the case of Kryo you should take a look at:
- spark.kryo.classesToRegister
- spark.kryo.registrator
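For example, the serializer and a registrator can be passed on the command line when launching the shell, so they take effect before the default context is created. In this sketch, com.example.MyGraphXKryoRegistrator stands for a hypothetical user-defined KryoRegistrator implementation; spark.serializer and spark.kryo.registrator are the actual configuration keys:

spark-shell \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.kryo.registrator=com.example.MyGraphXKryoRegistrator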