Packaging client side code to pass to an RDD
Question
A Scala function is passed into rdd.map(). The logic is too complex to be included within the function itself, so it is encapsulated within a Scala object instead. The object is part of the application which instantiates the Spark context, as in the following example:
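import org.apache.spark.SparkContext

def func(s: String): String = {
  // LogicEngine is an object which, given a string, returns a different string
  LogicEngine.process(s)
}
val sc = new SparkContext(config)
val rdd = sc.textFile("…")
// func runs on the worker nodes, once per line of the input
val rdd2 = rdd.map(func)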
The question is: what is the correct way to do this so that LogicEngine is itself passed to the nodes on which the RDD is being processed (so that it lives together with the function code passed to the RDD), rather than sitting on the client?
Thanks
Recommended answer
That's what you have already. A Scala object is not serialized along with the function; each node (more precisely, each executor JVM) will instantiate and use its own copy of LogicEngine when it's first accessed.
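The "first accessed" behaviour can be seen without a cluster. The following is a minimal sketch in plain Scala, not Spark code: a local Seq stands in for the RDD, the body of LogicEngine.process is a placeholder, and InitLog and Demo are hypothetical helpers introduced only to count how often the object's initializer runs. On a real cluster the same initialization would happen once in each executor JVM.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper: records how many times LogicEngine's initializer has run in this JVM.
object InitLog {
  val count = new AtomicInteger(0)
}

// Stand-in for the LogicEngine object from the question (placeholder logic).
object LogicEngine {
  // The object body runs once per JVM, the first time the object is accessed.
  InitLog.count.incrementAndGet()

  def process(s: String): String = s.toUpperCase
}

object Demo {
  // Same shape as the function passed to rdd.map in the question.
  def func(s: String): String = LogicEngine.process(s)

  def main(args: Array[String]): Unit = {
    // LogicEngine has not been touched yet, so its initializer has not run.
    println(InitLog.count.get()) // prints 0

    // A local collection stands in for the RDD; the first call to func triggers initialization.
    val out = Seq("a", "b", "c").map(func)
    println(out) // prints List(A, B, C)

    // Three calls, but only one initialization.
    println(InitLog.count.get()) // prints 1
  }
}
```

Since each JVM builds its own copy, any mutable state kept inside LogicEngine would not be shared between nodes or sent back to the client.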