spark-submit error: ClassNotFoundException


Problem Description

build.sbt

lazy val commonSettings = Seq(
    organization := "com.me",
    version := "0.1.0",
    scalaVersion := "2.11.0"
)

lazy val counter = (project in file("counter")).
    settings(commonSettings:_*)

counter/build.sbt

name := "counter"
mainClass := Some("Counter")
scalaVersion := "2.11.0"

val sparkVersion = "2.1.1";

libraryDependencies += "org.apache.spark" %% "spark-core" % sparkVersion % "provided";
libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % "provided";
libraryDependencies += "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided";

libraryDependencies += "com.datastax.spark" %% "spark-cassandra-connector" % "2.0.2";
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-8" % sparkVersion;

libraryDependencies += "com.github.scopt" %% "scopt" % "3.5.0";

libraryDependencies += "org.scalactic" %% "scalactic" % "3.0.1";
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1" % "test";

mergeStrategy in assembly := {
  case PathList("org", "apache", "spark", "unused", "UnusedStubClass.class") => MergeStrategy.first
  case x => (mergeStrategy in assembly).value(x)
}

counter.scala:

object Counter extends SignalHandler
{
    var ssc : Option[StreamingContext] = None;
    def main( args: Array[String])

Run:

./spark-submit --class "Counter" --master spark://10.1.204.67:6066 --deploy-mode cluster file://counter-assembly-0.1.0.jar

Error:

17/06/21 19:00:25 INFO Utils: Successfully started service 'Driver' on port 50140.
17/06/21 19:00:25 INFO WorkerWatcher: Connecting to worker spark://Worker@10.1.204.57:52476
Exception in thread "main" java.lang.ClassNotFoundException: Counter
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at org.apache.spark.util.Utils$.classForName(Utils.scala:229)
    at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:56)
    at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)

Any idea? Thanks.

UPDATE

I had the problem described in "Failed to submit local jar to Spark cluster: java.nio.file.NoSuchFileException". Now I copied the jar into spark-2.1.0-bin-hadoop2.7/bin and ran ./spark-submit --class "Counter" --master spark://10.1.204.67:6066 --deploy-mode cluster file://Counter-assembly-0.1.0.jar

The Spark cluster is 2.1.0.

But the jar was assembled against Spark 2.1.1 and Scala 2.11.0.

Answer

It appears you've just started developing Spark applications with Scala, so to help you and other future Spark developers, here are enough steps to get going with the environment.

It appears you use a multi-project sbt build, which is why you have two build.sbt files. To fix your issue, I'll pretend you don't use this advanced sbt setup.

It appears you use Spark Streaming, so define it as a dependency (in libraryDependencies). You don't have to define the other Spark dependencies (like spark-core or spark-sql).

You should have a build.sbt as follows:

organization := "com.me"
version := "0.1.0"
scalaVersion := "2.11.0"
val sparkVersion = "2.1.1"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided"

Building Deployable Package

With the build.sbt above, you execute sbt package to build a deployable Spark application package that you eventually spark-submit to a Spark cluster.

You don't have to use sbt assembly for that...yet. I can see that you use the Spark Cassandra Connector and other dependencies, which could also be defined using --packages or --jars instead (each approach has its pros and cons); there is a sketch of the --packages variant after the final spark-submit command below.

sbt package

The size of the final target/scala-2.11/counter_2.11-0.1.0.jar is going to be much smaller than the counter-assembly-0.1.0.jar you built with sbt assembly, because sbt package does not include the dependencies in a single jar file. That's expected and fine.

After sbt package you should have the deployable package in target/scala-2.11 as counter_2.11-0.1.0.jar.

You should just spark-submit with the required options, which in your case would be:

spark-submit \
  --master spark://10.1.204.67:6066 \
  target/scala-2.11/counter_2.11-0.1.0.jar

That's it.
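
As an aside, if you later reinstate the Cassandra connector without assembling it into the jar, spark-submit's --packages option can fetch it at submit time. This is only a sketch; the Maven coordinates are inferred from the versions in the question (Scala 2.11, connector 2.0.2), so adjust them to your cluster:

spark-submit \
  --packages com.datastax.spark:spark-cassandra-connector_2.11:2.0.2 \
  --master spark://10.1.204.67:6066 \
  target/scala-2.11/counter_2.11-0.1.0.jar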

Please note:

  1. --deploy-mode cluster is too advanced for this exercise (let's keep it simple and bring it back when needed).

  2. file:// makes things broken (or at least is superfluous).

  3. --class "Counter" is taken care of by sbt package when you have a single Scala application in the project. You can safely skip it (if you ever need to set the main class explicitly, see the sketch after this list).
