Running Spark code locally in Eclipse with Spark installed on a remote server

Question

I have configured Eclipse for Scala, created a Maven project, and written a simple word-count Spark job on Windows. Spark and Hadoop are installed on a Linux server. How can I launch my Spark code from Eclipse onto the Spark cluster (which is on Linux)?

Any suggestions?

Answer

Actually this answer is not as simple as you might expect.

I will make a few assumptions: first, that you use sbt; second, that you are working on a Linux-based machine; third, that your project has two classes, say RunMe and Globals; and last, that you want to set up the configuration inside the program. Thus, somewhere in your runnable code you must have something like this:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object RunMe {
  def main(args: Array[String]) {
    // Configure the application entirely in code. The master URL below
    // assumes Mesos, and that the hostname "master" resolves to its IP.
    val conf = new SparkConf()
      .setMaster("mesos://master:5050")
      .setAppName("my-app")
      .set("spark.executor.memory", "10g")
    val sc = new SparkContext(conf)
    // SQLContext takes the SparkContext as its constructor argument
    val sqlContext = new SQLContext(sc)

    //your code comes here
  }
}
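
A side note beyond the original answer: hardcoding the master URL with setMaster ties the jar to one specific cluster. If you omit that call, you can pass --master to spark-submit at launch time instead, so the same assembly can run against any cluster.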

The steps you must follow are:

  • Compile the project, from its root directory, using:

$ sbt assembly
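
Note that assembly is not a built-in sbt task; it comes from the sbt-assembly plugin, so the build needs two small files. A minimal sketch, assuming Scala 2.11 and Spark 1.6 (the versions are illustrative; match them to your cluster):

// project/plugins.sbt -- registers the sbt-assembly plugin (version is an assumption)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

// build.sbt -- Spark is marked "provided" so the fat jar does not bundle
// classes the cluster already ships (artifact versions are assumptions)
name := "my-app"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.6.0" % "provided"
)

Marking the Spark artifacts as "provided" keeps them out of the assembly; spark-submit supplies them on the cluster's classpath at runtime.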

Send the job to the master node. This is the interesting part: assuming your project has the structure target/scala/, with a .jar file inside that corresponds to the compiled project, run:

$ spark-submit --class RunMe target/scala/app.jar

Notice that, because I assumed the project has two or more classes, you have to identify which class you want to run. Furthermore, I would bet that the procedure for Yarn and Mesos is very similar.
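
For instance, a hedged sketch of the same submission against a YARN cluster (--master and --deploy-mode are standard spark-submit flags; the jar path reuses the layout assumed above, and in this case you would drop the hardcoded Mesos master from setMaster):

$ spark-submit --master yarn --deploy-mode cluster --class RunMe target/scala/app.jar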
