Spark Shell Listens on localhost instead of configured IP address


Problem Description

I am trying to run a simple Spark job via spark-shell, and it looks like the BlockManager for the spark-shell listens on localhost instead of the configured IP address, which causes the Spark job to fail. The exception thrown is "Failed to connect to localhost".

Here is my configuration:

Machine 1 (ubunt64): Spark Master [192.168.253.136]

Machine 2 (ubuntu64server): Spark Slave [192.168.253.137]

Machine 3 (ubuntu64server2): Spark Shell Client [192.168.253.138]

Spark version: spark-1.3.0-bin-hadoop2.4
Environment: Ubuntu 14.04

Source code executed in the Spark Shell:

    import org.apache.spark.SparkConf
    import org.apache.spark.SparkContext

    // Point the new context at the standalone master and advertise this machine's address
    var conf = new SparkConf().setMaster("spark://192.168.253.136:7077")
    conf.set("spark.driver.host","192.168.253.138")
    conf.set("spark.local.ip","192.168.253.138")

    // Stop the shell's pre-built context and create a fresh one from the conf above
    sc.stop
    var sc = new SparkContext(conf)

    // Simple test job
    val textFile = sc.textFile("README.md")
    textFile.count()

The above code just works fine if I run it on Machine 2, where the slave is running, but it fails on Machine 1 (Master) and Machine 3 (Spark Shell).

Not sure why the spark-shell listens on localhost instead of the configured IP address. I have set SPARK_LOCAL_IP on Machine 3 using spark-env.sh as well as in .bashrc (export SPARK_LOCAL_IP=192.168.253.138). I confirmed that the spark-shell Java process does listen on port 44015. Not sure why spark-shell is broadcasting the localhost address.

Any help to troubleshoot this issue will be highly appreciated. Probably I am missing some configuration setting.

Logs:

scala> val textFile = sc.textFile("README.md")

15/04/22 18:15:22 INFO MemoryStore: ensureFreeSpace(163705) called with curMem=0, maxMem=280248975

15/04/22 18:15:22 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 159.9 KB, free 267.1 MB)

15/04/22 18:15:22 INFO MemoryStore: ensureFreeSpace(22692) called with curMem=163705, maxMem=280248975

15/04/22 18:15:22 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 22.2 KB, free 267.1 MB)

15/04/22 18:15:22 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:44015 (size: 22.2 KB, free: 267.2 MB)

scala> textFile.count()

15/04/22 18:16:07 INFO DAGScheduler: Submitting 2 missing tasks from Stage 0 (README.md MapPartitionsRDD[1] at textFile at :25)

15/04/22 18:16:07 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks

15/04/22 18:16:08 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, ubuntu64server, PROCESS_LOCAL, 1326 bytes)

15/04/22 18:16:23 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, ubuntu64server, PROCESS_LOCAL, 1326 bytes)

15/04/22 18:16:23 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, ubuntu64server): java.io.IOException: Failed to connect to localhost/127.0.0.1:44015
        at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:191)
        at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
        at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:78)
        at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
        at org.apache.spark.network.shuffle.RetryingBlockFetcher.access$200(RetryingBlockFetcher.java:43)
        at org.apache.spark.network.shuffle.RetryingBlockFetcher$1.run(RetryingBlockFetcher.java:170)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

Answer

Found a workaround for this BlockManager localhost issue by providing the Spark master address at shell start-up (or it can be set in spark-defaults.conf).

./spark-shell --master spark://192.168.253.136:7077 
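For reference, here is a minimal sketch of the spark-defaults.conf alternative mentioned above (spark.master is the standard property name; the address is the master from this setup):

    # conf/spark-defaults.conf -- read automatically when spark-shell starts
    spark.master    spark://192.168.253.136:7077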

This way, I didn't have to stop the Spark context, and the original context was able to read files as well as read data from Cassandra tables.
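As a usage sketch (assuming README.md is reachable from the workers), the test from the question can then be run against the shell's pre-built context directly, with no sc.stop or new SparkContext:

    // spark-shell was started with --master spark://192.168.253.136:7077,
    // so the built-in `sc` already talks to the cluster.
    val textFile = sc.textFile("README.md")
    textFile.count()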

Here is the log of the BlockManager listening on localhost (with the stop-and-recreate approach for the context), which fails with the "Failed to connect" exception:

15/04/25 07:10:27 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on localhost:40235 (size: 1966.0 B, free: 267.2 MB)

Compare that to listening on the actual server name (when the Spark master is provided on the command line), which works:

15/04/25 07:12:47 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on ubuntu64server2:33301 (size: 1966.0 B, free: 267.2 MB)

It looks like a bug in the BlockManager code when the context is dynamically created in the shell.

Hope this helps someone.
