Scala-spark-corenlp-java.lang.ClassNotFoundException [英] Scala - spark-corenlp - java.lang.ClassNotFoundException
问题描述
我想运行spark-coreNLP 示例,但是我遇到了java.lang.ClassNotFoundException运行spark-submit时出错.
I want to run spark-coreNLP example, but I get an java.lang.ClassNotFoundException error when running spark-submit.
这是来自github示例的scala代码,我将其放入一个对象中,并定义了一个SparkContext.
Here is the scala code, from the github example, which I put into an object, and defined a SparkContext.
analyzer.Sentiment.scala:
analyzer.Sentiment.scala:
package analyzer
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf
import org.apache.spark.sql.functions._
import com.databricks.spark.corenlp.functions._
import sqlContext.implicits._
object Sentiment {
def main(args: Array[String]) {
val conf = new SparkConf().setAppName("Sentiment")
val sc = new SparkContext(conf)
val input = Seq(
(1, "<xml>Stanford University is located in California. It is a great university.</xml>")
).toDF("id", "text")
val output = input
.select(cleanxml('text).as('doc))
.select(explode(ssplit('doc)).as('sen))
.select('sen, tokenize('sen).as('words), ner('sen).as('nerTags), sentiment('sen).as('sentiment))
output.show(truncate = false)
}
}
我正在使用spark-coreNLP提供的build.sbt-我只将scalaVersion和sparkVerison修改为自己的.
I am using the build.sbt provided by spark-coreNLP - I only modified the scalaVersion and sparkVerison to my own.
version := "1.0"
scalaVersion := "2.11.8"
initialize := {
val _ = initialize.value
val required = VersionNumber("1.8")
val current = VersionNumber(sys.props("java.specification.version"))
assert(VersionNumber.Strict.isCompatible(current, required), s"Java $required required.")
}
sparkVersion := "1.5.2"
// change the value below to change the directory where your zip artifact will be created
spDistDirectory := target.value
sparkComponents += "mllib"
spName := "databricks/spark-corenlp"
licenses := Seq("GPL-3.0" -> url("http://opensource.org/licenses/GPL-3.0"))
resolvers += Resolver.mavenLocal
libraryDependencies ++= Seq(
"edu.stanford.nlp" % "stanford-corenlp" % "3.6.0",
"edu.stanford.nlp" % "stanford-corenlp" % "3.6.0" classifier "models",
"com.google.protobuf" % "protobuf-java" % "2.6.1"
)
然后,我通过运行而没有问题地创建了jar.
Then, I created my jar by running without issues.
sbt package
最后,我将工作提交给Spark:
Finally, I submit my job to Spark:
spark-submit --class "analyzer.Sentiment" --master local[4] target/scala-2.11/sentimentanalizer_2.11-0.1-SNAPSHOT.jar
但是出现以下错误:
java.lang.ClassNotFoundException: analyzer.Sentiment
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:173)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:641)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
我的文件Sentiment.scala是正确的,位于名为"analyzer"的软件包中.
My file Sentiment.scala is correclty located in a package named "analyzer".
$ find .
./src
./src/analyzer
./src/analyzer/Sentiment.scala
./src/com
./src/com/databricks
./src/com/databricks/spark
./src/com/databricks/spark/corenlp
./src/com/databricks/spark/corenlp/CoreNLP.scala
./src/com/databricks/spark/corenlp/functions.scala
./src/com/databricks/spark/corenlp/StanfordCoreNLPWrapper.scala
当我从 Spark Quick运行SimpleApp示例时开始,我注意到MySimpleProject/bin/包含一个SimpleApp.class.MySentimentProject/bin为空.因此,我尝试清理我的项目(我正在使用Eclipse for Scala).
When I ran the SimpleApp example from the Spark Quick Start , I noticed that MySimpleProject/bin/ contained a SimpleApp.class. MySentimentProject/bin is empty. So I have tried to clean my project (I am using Eclipse for Scala).
我认为这是因为我需要生成Sentiment.class,但是我不知道该怎么做-它是通过SimpleApp.scala自动完成的,当它尝试使用Eclipse Scala运行/构建时,它崩溃了
I think it is because I need to generate Sentiment.class, but I don't know how to do it - It was done automatically with SimpleApp.scala, and when it ry to run/build with Eclipse Scala, it crashes.
推荐答案
也许您应该尝试添加
scalaSource in Compile := baseDirectory.value / "src"
to your build.sbt
, cause sbt document reads that "the directory that contains the main Scala sources is by default src/main/scala
".
或者仅将源代码制成这种结构
Or just make your source code in this structure
$ find .
./src
./src/main
./src/main/scala
./src/main/scala/analyzer
./src/main/scala/analyzer/Sentiment.scala
./src/main/scala/com
./src/main/scala/com/databricks
./src/main/scala/com/databricks/spark
./src/main/scala/com/databricks/spark/corenlp
./src/main/scala/com/databricks/spark/corenlp/CoreNLP.scala
./src/main/scala/com/databricks/spark/corenlp/functions.scala
./src/main/scala/com/databricks/spark/corenlp/StanfordCoreNLPWrapper.scala
这篇关于Scala-spark-corenlp-java.lang.ClassNotFoundException的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!