'Connection Refused' error while running Spark Streaming on local machine

Question

I know there are already many threads on 'spark streaming connection refused' issues, but most of them concern Linux or at least involve HDFS. I am running this on my local Windows laptop.

I am running a very basic standalone Spark Streaming application, just to see how streaming works. It does nothing complex:

import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.SparkConf

object MyStream 
{
    def main(args:Array[String]) 
    {
        val sc = new StreamingContext(new SparkConf(),Seconds(10))
        val mystreamRDD = sc.socketTextStream("localhost",7777)
        mystreamRDD.print()
        sc.start()
        sc.awaitTermination()
    }
}

I am getting the following error:

2015-07-25 18:13:07 INFO  ReceiverSupervisorImpl:59 - Starting receiver
2015-07-25 18:13:07 INFO  ReceiverSupervisorImpl:59 - Called receiver onStart
2015-07-25 18:13:07 INFO  SocketReceiver:59 - Connecting to localhost:7777
2015-07-25 18:13:07 INFO  ReceiverTracker:59 - Registered receiver for stream 0 from 192.168.19.1:11300
2015-07-25 18:13:08 WARN  ReceiverSupervisorImpl:92 - Restarting receiver with delay 2000 ms: Error connecting to localhost:7777
java.net.ConnectException: Connection refused

I have tried different port numbers, but that doesn't help. The receiver keeps retrying in a loop and keeps getting the same error. Does anyone have an idea?

Recommended answer

Within the code for socketTextStream, Spark creates an instance of SocketInputDStream, which uses java.net.Socket: https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala#L73

java.net.Socket is a client socket, which means it expects a server to be already listening at the address and port you specify. Unless some service is listening on port 7777 of your local machine, the error you are seeing is expected. Changing the port number does not help for the same reason: nothing is listening on the other ports either.
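The client/server distinction can be checked without Spark. The following is a minimal sketch, not part of the original answer: `PortProbe` and `canConnect` are hypothetical names introduced here, and the sketch assumes that `localhost` resolves to an address the test listener is bound to. It opens a plain `java.net.Socket` against a port once while a `ServerSocket` is listening and once after that listener has been closed.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, ServerSocket, Socket}

object PortProbe {
  // Hypothetical helper: true if a TCP connection to host:port can be established.
  def canConnect(host: String, port: Int, timeoutMs: Int = 1000): Boolean = {
    val socket = new Socket()
    try {
      socket.connect(new InetSocketAddress(host, port), timeoutMs)
      true
    } catch {
      // ConnectException ("Connection refused") and timeouts are both IOExceptions
      case _: IOException => false
    } finally {
      socket.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Port 0 asks the OS for any free port; the socket is listening once constructed.
    val server = new ServerSocket(0)
    val port = server.getLocalPort
    println(s"while a server is listening: ${canConnect("localhost", port)}")
    server.close()
    // With the listener gone, the same connect attempt should now fail.
    println(s"after the server is closed:  ${canConnect("localhost", port)}")
  }
}
```

If the second call reports a failure, that is the same condition the Spark receiver hits when nothing is listening on localhost:7777.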

To see what I mean, try the following (you may not need to set master or appName in your environment).

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MyStream {
  def main(args: Array[String]): Unit = {
    // "local[2]": the receiver occupies one thread, so a second one is needed to process batches
    val conf = new SparkConf().setMaster("local[2]").setAppName("socketstream")
    val sc = new StreamingContext(conf, Seconds(10))
    // Connect as a TCP client to a host that is known to be listening on port 80
    val mystreamRDD = sc.socketTextStream("bbc.co.uk", 80)
    mystreamRDD.print()
    sc.start()
    sc.awaitTermination()
  }
}

This doesn't return any content, because the app doesn't speak HTTP to the BBC website, but it does not get a connection refused exception.

To run a local server on Linux, I would use netcat with a simple command such as:

cat data.txt | ncat -l -p 7777

I'm not sure what the best approach is on Windows. You could write another application that listens as a server on that port and sends some data.
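One way such a server application might look is sketched below. It is an illustration only, not code from the original answer: `LineServer`, `serveOnce`, the generated messages, and the 500 ms delay are all assumptions made for this example. It uses only the JDK, so it should behave the same on Windows and Linux. The lines are written as newline-terminated UTF-8 text, which is the form socketTextStream is documented to read.

```scala
import java.io.{OutputStreamWriter, PrintWriter}
import java.net.ServerSocket
import java.nio.charset.StandardCharsets

object LineServer {
  // Hypothetical helper: accept one client on a bound socket and send it the given lines.
  def serveOnce(server: ServerSocket, lines: Seq[String], delayMs: Long = 0L): Unit = {
    val client = server.accept() // blocks until a client (e.g. the Spark receiver) connects
    val out = new PrintWriter(
      new OutputStreamWriter(client.getOutputStream, StandardCharsets.UTF_8),
      true) // autoflush on println
    try {
      lines.foreach { line =>
        out.println(line)
        if (delayMs > 0) Thread.sleep(delayMs) // spread the data over several batches
      }
    } finally {
      out.close()
      client.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val server = new ServerSocket(7777) // same port the question passes to socketTextStream
    try {
      // Serve clients one after another until the process is stopped.
      while (true) serveOnce(server, (1 to 100).map(i => s"message $i"), delayMs = 500)
    } finally {
      server.close()
    }
  }
}
```

Start this program first, then start the Spark application from the question. With a listener on localhost:7777, the receiver should connect instead of being refused. When the server closes a connection, the receiver is expected to restart and reconnect, at which point the loop serves it again.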
