Execute Apache Spark (Scala) code in Bash script


Problem description

I am a newbie to Spark and Scala. I wanted to execute some Spark code from inside a Bash script. I wrote the following code.

The Scala code was written in a separate .scala file as follows.

Scala code:

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    println("x="+args(0),"y="+args(1))
  }
}

This is the Bash script that invokes the Apache Spark/Scala code.

Bash code:

#!/usr/bin/env bash
ABsize=File_size1
ADsize=File_size2
for i in `seq 2 $ABsize`
do
    for j in `seq 2 $ADsize`
    do
        # read line $i of File_Path1 and line $j of File_Path2
        Abi=`sed -n ""$i"p" < File_Path1`
        Adj=`sed -n ""$j"p" < File_Path2`
        scala SimpleApp.scala $Abi $Adj
    done
done

But then I get the following errors.

Errors:

error: object apache is not a member of package org
import org.apache.spark.SparkContext
           ^
error: object apache is not a member of package org
import org.apache.spark.SparkContext._
           ^
error: object apache is not a member of package org
import org.apache.spark.SparkConf
           ^
error: not found: type SparkConf
val conf = new SparkConf().setAppName("Simple Application")
               ^
error: not found: type SparkContext

The above code works perfectly if the Scala file is written without any Spark functionality (that is, a pure Scala file), but it fails as soon as the Apache Spark imports are present.

What would be a good way to run and execute this from a Bash script? Would I have to call the Spark shell to execute the code?

Recommended answer

Set up Spark via its environment variables and, as @puhlen suggested, run the job with spark-submit: spark-submit --class SimpleApp simple-project_2.11-1.0.jar $Abi $Adj
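For concreteness, here is a minimal sketch of how the question's loop might look once the application has been packaged into a jar. It assumes the jar was built beforehand (e.g. with sbt package, producing the simple-project_2.11-1.0.jar named in the answer), that spark-submit is on the PATH, and that the job should run locally (the --master local[*] flag is an assumption, not part of the original answer); the file-name placeholders are the ones from the question.

#!/usr/bin/env bash
# Assumes SimpleApp has already been packaged into a jar,
# e.g. with `sbt package`, producing simple-project_2.11-1.0.jar.
ABsize=File_size1
ADsize=File_size2
for i in `seq 2 $ABsize`
do
    for j in `seq 2 $ADsize`
    do
        # read line $i of File_Path1 and line $j of File_Path2
        Abi=`sed -n ""$i"p" < File_Path1`
        Adj=`sed -n ""$j"p" < File_Path2`
        # spark-submit puts the Spark jars on the classpath,
        # which the plain `scala` launcher does not.
        # --master local[*] assumes a local run.
        spark-submit --master "local[*]" --class SimpleApp \
            simple-project_2.11-1.0.jar "$Abi" "$Adj"
    done
done

Because spark-submit ships the Spark jars on the classpath, the "object apache is not a member of package org" errors go away, and invoking spark-submit once per pair of arguments keeps the structure of the original script.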
