'Connection Refused' error while running Spark Streaming on local machine
Problem description
I know there are already many threads on 'spark streaming connection refused' issues, but most of them concern Linux or at least point to HDFS. I am running this on my local Windows laptop.
I am running a very basic standalone Spark Streaming application, just to see how streaming works. Nothing complex here:
import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.SparkConf

object MyStream
{
  def main(args:Array[String])
  {
    val sc = new StreamingContext(new SparkConf(),Seconds(10))
    val mystreamRDD = sc.socketTextStream("localhost",7777)
    mystreamRDD.print()
    sc.start()
    sc.awaitTermination()
  }
}
I am getting the following error:
2015-07-25 18:13:07 INFO ReceiverSupervisorImpl:59 - Starting receiver
2015-07-25 18:13:07 INFO ReceiverSupervisorImpl:59 - Called receiver onStart
2015-07-25 18:13:07 INFO SocketReceiver:59 - Connecting to localhost:7777
2015-07-25 18:13:07 INFO ReceiverTracker:59 - Registered receiver for stream 0 from 192.168.19.1:11300
2015-07-25 18:13:08 WARN ReceiverSupervisorImpl:92 - Restarting receiver with delay 2000 ms: Error connecting to localhost:7777
java.net.ConnectException: Connection refused
I have tried different port numbers, but that doesn't help. The receiver keeps retrying in a loop and keeps getting the same error. Does anyone have an idea?
Recommended answer
Within the code for socketTextStream, Spark creates an instance of SocketInputDStream, which uses java.net.Socket:
https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala#L73
java.net.Socket is a client socket, which means it expects a server to already be running at the address and port you specify. Unless some service is listening on port 7777 of your local machine, the error you are seeing is expected.
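The client-socket behaviour can be reproduced without Spark. The following is a minimal sketch, not taken from the Spark source: the object name ClientSocketDemo and the helper canConnect are made up for illustration. It tries to open a java.net.Socket to a port first while a listener is bound to it and then after the listener has been closed.

```scala
import java.net.{ConnectException, ServerSocket, Socket}

object ClientSocketDemo {
  // Hypothetical helper: true if something accepts TCP connections on host:port.
  def canConnect(host: String, port: Int): Boolean =
    try {
      new Socket(host, port).close()
      true
    } catch {
      // Nothing is listening on the port: the "Connection refused" case.
      case _: ConnectException => false
    }

  def main(args: Array[String]): Unit = {
    // Port 0 asks the operating system for any free port.
    val server = new ServerSocket(0)
    val port = server.getLocalPort
    println(s"listener bound:  ${canConnect("localhost", port)}")
    server.close()
    println(s"listener closed: ${canConnect("localhost", port)}")
  }
}
```

The second attempt should fail with the same java.net.ConnectException that appears in the receiver log, assuming no other process takes over the port in between.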
To see what I mean, try the following (you may not need to set master or appName in your environment).
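import org.apache.spark.streaming.Seconds
import org.apache.spark.streaming.StreamingContext
import org.apache.spark.SparkConf

object MyStream {
  def main(args: Array[String]): Unit = {
    // "local[2]" rather than "local": the socket receiver occupies one thread,
    // so at least one more is needed to process the received batches.
    val conf = new SparkConf().setMaster("local[2]").setAppName("socketstream")
    val sc = new StreamingContext(conf, Seconds(10))
    // Connect to a host that is known to be listening on the given port.
    val mystreamRDD = sc.socketTextStream("bbc.co.uk", 80)
    mystreamRDD.print()
    sc.start()
    sc.awaitTermination()
  }
}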
This doesn't return any content, because the app doesn't speak HTTP to the BBC website, but it does not get a connection refused exception either.
To run a local server on Linux, I would use netcat with a simple command such as:
cat data.txt | ncat -l -p 7777
I'm not sure what the best approach is on Windows. You could write another application that listens as a server on that port and sends some data.
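As a rough illustration of that last suggestion, here is a minimal sketch of such a server in Scala using only the JDK, so it should behave the same on Windows. The object name LineServer, the helper serveOnce, and the sample lines are all made up for this example; only port 7777 comes from the question.

```scala
import java.io.{OutputStreamWriter, PrintWriter}
import java.net.ServerSocket
import java.nio.charset.StandardCharsets

object LineServer {
  // Hypothetical helper: accept one client, send each line terminated
  // by '\n', then close the connection.
  def serveOnce(server: ServerSocket, lines: Seq[String]): Unit = {
    val client = server.accept() // blocks until a client (e.g. the Spark receiver) connects
    try {
      val out = new PrintWriter(
        new OutputStreamWriter(client.getOutputStream, StandardCharsets.UTF_8))
      lines.foreach(line => out.print(line + "\n"))
      out.flush()
    } finally client.close()
  }

  def main(args: Array[String]): Unit = {
    val server = new ServerSocket(7777) // the port used in the question
    try {
      // Serve the same sample lines to every client that connects.
      while (true) serveOnce(server, Seq("hello", "spark", "streaming"))
    } finally server.close()
  }
}
```

Start this program before the streaming application. Because it closes the connection after sending its lines, the receiver is expected to reconnect after its restart delay and receive the same lines again, which is enough to watch batches being printed.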