Execute Apache Spark (Scala) code in Bash script
Question
I am a newbie to Spark and Scala. I want to execute some Spark code from inside a bash script, so I wrote the following code.
The Scala code was written in a separate .scala file, as follows.
Scala code:
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    println("x=" + args(0), "y=" + args(1))
  }
}
This is the bash script that invokes the Apache Spark/Scala code.
Bash code:
#!/usr/bin/env bash
Absize=File_size1
Adsize=File_size2

for i in $(seq 2 "$Absize")
do
  for j in $(seq 2 "$Adsize")
  do
    Abi=$(sed -n "${i}p" File_Path1)
    Adj=$(sed -n "${j}p" File_Path2)
    scala SimpleApp.scala "$Abi" "$Adj"
  done
done
But then I get the following errors.
Errors:
error: object apache is not a member of package org
import org.apache.spark.SparkContext
^
error: object apache is not a member of package org
import org.apache.spark.SparkContext._
^
error: object apache is not a member of package org
import org.apache.spark.SparkConf
^
error: not found: type SparkConf
val conf = new SparkConf().setAppName("Simple Application")
^
error: not found: type SparkContext
The above code works perfectly if the Scala file is written without any Spark functionality (that is, a pure Scala file), but it fails when the Apache Spark imports are present.
What would be a good way to run and execute this from a bash script? Will I have to call the Spark shell to execute the code?
Recommended answer
Set up Spark via its environment variables and run the app as @puhlen suggested, with spark-submit --class SimpleApp simple-project_2.11-1.0.jar $Abi $Adj.
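As a concrete illustration, here is a minimal sketch of the corrected script under a few assumptions: the app is packaged with sbt (for example, with a build.sbt declaring libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0", which would match the simple-project_2.11-1.0.jar artifact name in the answer), spark-submit is on the PATH, and the File_size/File_Path placeholders keep their meaning from the question. The --master local[2] setting is only an example; use whatever master your setup requires.

#!/usr/bin/env bash
# Build the jar first (assumes a standard sbt layout with
# SimpleApp under src/main/scala).
sbt package

Absize=File_size1   # placeholder from the question: line count of File_Path1
Adsize=File_size2   # placeholder from the question: line count of File_Path2

for i in $(seq 2 "$Absize")
do
  for j in $(seq 2 "$Adsize")
  do
    Abi=$(sed -n "${i}p" File_Path1)
    Adj=$(sed -n "${j}p" File_Path2)
    # spark-submit runs the compiled jar with the Spark jars on the
    # classpath, which is what the plain "scala SimpleApp.scala" call
    # was missing.
    spark-submit --class SimpleApp --master "local[2]" \
      target/scala-2.11/simple-project_2.11-1.0.jar "$Abi" "$Adj"
  done
done

The key point is that spark-submit takes a compiled jar rather than a bare .scala source file, so the org.apache.spark imports are resolved at build time by sbt instead of failing at run time.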