Why is "Cannot call methods on a stopped SparkContext" thrown when connecting to Spark Standalone from Java application?


Problem description

I have downloaded Apache Spark 1.4.1 pre-built for Hadoop 2.6 and later. I have two Ubuntu 14.04 machines. I have set one of them up as the Spark master with a single slave, and the second machine is running one Spark slave. When I execute the ./sbin/start-all.sh command, the master and the slaves start successfully. After that I ran the sample Pi program in spark-shell, setting --master to spark://192.168.0.105:7077, the Spark master URL displayed in the Spark web UI.

So far everything works great.

I have created a Java application and I tried to configure it to run Spark jobs when needed. I added the Spark dependencies in the pom.xml file:

        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>1.4.1</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.11</artifactId>
            <version>1.4.1</version>
        </dependency>

I have created the SparkConf:

private SparkConf sparkConfig = new SparkConf(true)
            .setAppName("Spark Worker")
            .setMaster("spark://192.168.0.105:7077");

Then I create a SparkContext using the SparkConf:

private SparkContext sparkContext = new SparkContext(sparkConfig);

At this step the following error is thrown:

java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
    at org.apache.spark.SparkContext.org$apache$spark$SparkContext$$assertNotStopped(SparkContext.scala:103)
    at org.apache.spark.SparkContext.getSchedulingMode(SparkContext.scala:1503)
    at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:2007)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:543)
    at com.storakle.dataimport.spark.StorakleSparkConfig.getSparkContext(StorakleSparkConfig.java:37)
    at com.storakle.dataimport.reportprocessing.DidNotBuyProductReport.prepareReportData(DidNotBuyProductReport.java:25)
    at com.storakle.dataimport.messagebroker.RabbitMQMessageBroker$1.handleDelivery(RabbitMQMessageBroker.java:56)
    at com.rabbitmq.client.impl.ConsumerDispatcher$5.run(ConsumerDispatcher.java:144)
    at com.rabbitmq.client.impl.ConsumerWorkService$WorkPoolRunnable.run(ConsumerWorkService.java:99)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

If I change the Spark master to local, everything works just fine.

private SparkConf sparkConfig = new SparkConf(true)
                .setAppName("Spark Worker")
                .setMaster("local");

I am running the Java app on the same machine that hosts the Spark master.

I have no idea why this is happening. Every piece of documentation and every example I have found so far indicates that the code should work with the Spark master URL.

Any ideas why this is happening and how I can fix it? I have spent a lot of time trying to figure this one out, with no luck so far.

Answer

I think you are using Spark 1.4.1 built for Scala 2.10. Therefore, you need spark-core_2.10 and spark-streaming_2.10 instead of the 2.11 artifacts: spark-core_2.11 is incompatible with a Spark distribution built for Scala 2.10.
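
For reference, here is a minimal pom.xml sketch with the Scala 2.10 artifacts suggested above, assuming you keep the pre-built 1.4.1 distribution from the question (the official pre-built binaries are compiled against Scala 2.10):

        <!-- Spark core built for Scala 2.10, matching the pre-built 1.4.1 cluster -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.10</artifactId>
            <version>1.4.1</version>
        </dependency>
        <!-- Spark Streaming for the same Scala version -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-streaming_2.10</artifactId>
            <version>1.4.1</version>
        </dependency>

As a rule of thumb, the _2.xx suffix of every Spark artifact has to match the Scala version the Spark distribution was built with; mixing them can fail in confusing ways, as in the stack trace above.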

For building Spark for Scala 2.11, see:

http://spark.apache.org/docs/latest/building-spark.html#building-for-scala-211
