Trying to get Spark Streaming to read a data stream from a website: what is the socket?
Question
I am trying to get this data, http://stream.meetup.com/2/rsvps, into Spark Streaming.
The records are JSON objects. I know the lines will arrive as strings; I just want the stream to work before I try parsing the JSON.
I am not sure what to put as the port; I assume that is the problem.
SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("Spark Streaming");
JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(1));
JavaReceiverInputDStream<String> lines = jssc.socketTextStream("http://stream.meetup.com/2/rsvps", 80);
lines.print();
jssc.start();
jssc.awaitTermination();
This is my error:
java.net.UnknownHostException: http://stream.meetup.com/2/rsvps
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:178)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:172)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.<init>(Socket.java:425)
at java.net.Socket.<init>(Socket.java:208)
Recommended answer
socketTextStream isn't designed to work as an HTTP client. It opens a plain TCP connection to a host name and a port, which is why passing a full URL as the host name fails with UnknownHostException. You will need to create a custom receiver instead. One potential place to start is the receiver created as part of the meetup streaming data source (see https://github.com/actions/meetup-stream/blob/master/src/main/scala/receiver/MeetupReceiver.scala ).
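To make the difference concrete, here is a minimal sketch in plain Java (standard library only) of the two pieces involved: splitting a URL into the host and port that a raw socket expects, and reading a line-oriented HTTP response with a real HTTP client. The class name HttpLineSource and its methods are hypothetical helpers made up for this illustration; they are not part of Spark or of the linked MeetupReceiver. The sketch also assumes that the endpoint sends one JSON object per line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

// Hypothetical helper for illustration; not part of Spark or MeetupReceiver.
class HttpLineSource {

    // A raw socket needs only the host name, never the scheme or the path.
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    // Returns the explicit port, or the scheme default (80 or 443) when none is given.
    static int portOf(String url) {
        URI uri = URI.create(url);
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // Opens the URL with an HTTP client, so a proper GET request is sent.
    static InputStream open(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        return conn.getInputStream();
    }

    // Reads newline-delimited records and passes each non-empty line to the sink.
    // Assumption: the stream delivers one record per line.
    static int pump(InputStream in, Consumer<String> sink) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    sink.accept(line);
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }
}
```

In a custom receiver (a subclass of org.apache.spark.streaming.receiver.Receiver<String>), onStart() would typically start a background thread that runs something like pump(open(url), this::store), and the stream would then be created with jssc.receiverStream(...) in place of socketTextStream. Note that even socketTextStream("stream.meetup.com", 80) would probably not help, because a raw socket never sends the HTTP request that the server waits for.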