spark scala on windows machine
Problem Description
I am learning from this class: https://www.udemy.com/big-data-analytics-with-apache-spark-and-scala/learn/v4/t/lecture/4729300. I have run the code as shown in the class, and I get the errors below. Any idea what I should do?
I have Spark 1.6.1 and Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_74).
val datadir = "C:/Personal/V2Maestros/Courses/Big Data Analytics with Spark/Scala"
//............................................................................
//// Building and saving the model
//............................................................................
val tweetData = sc.textFile(datadir + "/movietweets.csv")
tweetData.collect()
def convertToRDD(inStr : String) : (Double,String) = {
val attList = inStr.split(",")
val sentiment = attList(0).contains("positive") match {
case true => 0.0
case false => 1.0
}
return (sentiment, attList(1))
}
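As a side note, the parsing function can be written a little more idiomatically: an if-expression replaces the match on a boolean, and Scala does not need the trailing `return`. This is a sketch under the same assumption as the original code (a two-column CSV where the first column may contain "positive"); the name `parseTweet` is mine, not from the course.

```scala
// Hypothetical idiomatic variant of convertToRDD: same behavior,
// but uses an if-expression and an implicit return value.
def parseTweet(inStr: String): (Double, String) = {
  val attList = inStr.split(",")
  // "positive" maps to 0.0 and everything else to 1.0,
  // matching the labels used in the original snippet.
  val sentiment = if (attList(0).contains("positive")) 0.0 else 1.0
  (sentiment, attList(1))
}
```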
val tweetText = tweetData.map(convertToRDD)
tweetText.collect()
//val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
var ttDF = sqlContext.createDataFrame(tweetText).toDF("label","text")
ttDF.show()
The error is:
scala> ttDF.show()
[Stage 2:> (0 + 2) / 2]16/03/30 11:40:25 ERROR ExecutorClassLoader: Failed to check existence of class org.apache.spark.sql.catalyst.expressio
REPL class server at http://192.168.56.1:54595
java.net.ConnectException: Connection timed out: connect
at java.net.TwoStacksPlainSocketImpl.socketConnect(Native Method)
I'm no expert, but the connection IP in the error message looks like a private node's address, or even your router/modem's local address.
As stated in the comments, it could be that you're running the context with a wrong configuration that tries to spread the work across a cluster that isn't there, instead of keeping it in your local JVM process.
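One quick check before changing anything: inside the spark-shell you can inspect which master URL the provided context is actually using. This is a REPL fragment, so it assumes `sc` already exists in the shell.

```scala
// Inside spark-shell: print the master URL of the provided context.
// A value like "local[*]" or "local[4]" means the work stays in the
// local JVM; anything else points at a (possibly absent) cluster.
println(sc.master)
```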
For further information you can read here and experiment with something like
import org.apache.spark.{SparkConf, SparkContext}
val sc = new SparkContext(master = "local[4]", appName = "tweetsClass", conf = new SparkConf)
Update
Since you're using the interactive shell and the SparkContext provided there, I guess you should pass the equivalent parameters to the shell command, as in
<your-spark-path>/bin/spark-shell --master local[4]
This tells the driver to use a local master, running Spark in a single JVM on the local machine with 4 worker threads.
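For a standalone program (outside the shell) the same local master can be set programmatically when building the context. A minimal sketch, reusing the app name from the snippet above:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Configure a local master with 4 threads explicitly, instead of
// relying on whatever cluster configuration the environment provides.
// Equivalent in effect to `spark-shell --master local[4]`.
val conf = new SparkConf()
  .setMaster("local[4]")
  .setAppName("tweetsClass")
val sc = new SparkContext(conf)
```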