Cartesian of DStream


Question

I use Spark's cartesian function to generate a list of N pairs of values.

I then map over these pairs to compute a distance metric between each pair of users:

val cartesianUsers: org.apache.spark.rdd.RDD[(distance.classes.User, distance.classes.User)] = users.cartesian(users)
cartesianUsers.map(m => manDistance(m._1, m._2))
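For readers who want to run the batch version end to end, here is a minimal sketch. The `User` case class and the Manhattan-distance body of `manDistance` are assumptions made for illustration, since the question shows neither `distance.classes.User` nor the real `manDistance`. Only `cartesian` and `map` are taken from the snippet above.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for distance.classes.User, which the question does not show.
case class User(id: String, features: Vector[Double])

object CartesianDistances {
  // Assumed definition: Manhattan (L1) distance over the feature vectors.
  def manDistance(a: User, b: User): Double =
    a.features.zip(b.features).map { case (x, y) => math.abs(x - y) }.sum

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("cartesian-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)

    val users = sc.parallelize(Seq(
      User("a", Vector(1.0, 2.0)),
      User("b", Vector(4.0, 6.0))
    ))

    // cartesian yields every ordered pair, so (a, a), (a, b), (b, a) and (b, b) all appear.
    val distances = users.cartesian(users).map {
      case (u1, u2) => (u1.id, u2.id, manDistance(u1, u2))
    }

    distances.collect().foreach(println)
    sc.stop()
  }
}
```

Note that the number of pairs grows quadratically with the number of users, so this approach is only practical when the input is reasonably small.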

This works as expected.

Using the Spark Streaming library, I create a DStream and then iterate over its RDDs:

val customReceiverStream: ReceiverInputDStream[String] = ssc.receiverStream....
customReceiverStream.foreachRDD(m => {
  println("size is " + m)
})

I could use the cartesian function within customReceiverStream.foreachRDD, but according to the documentation (http://spark.apache.org/docs/1.2.0/streaming-programming-guide.htm) this is not its intended use:

foreachRDD(func) The most generic output operator that applies a function, func, to each RDD generated from the stream. This function should push the data in each RDD to a external system, like saving the RDD to files, or writing it over the network to a database. Note that the function func is executed in the driver process running the streaming application, and will usually have RDD actions in it that will force the computation of the streaming RDDs.

How can I compute the cartesian product of a DStream? Perhaps I'm misunderstanding how DStreams are meant to be used?

Recommended answer

I wasn't aware of the transform method:

cartesianUsers.transform(car => car.cartesian(car))
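A slightly fuller sketch of the same idea follows. `transform` applies an RDD-to-RDD function to each batch, so the cartesian product here is taken within a single batch and not across batches. The `queueStream` source, the sample strings, and the names `DStreamCartesian`, `pairs`, and `selfCartesian` are assumptions for illustration; the question's custom receiver is not shown, so it is not reproduced.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.dstream.DStream

object DStreamCartesian {
  // RDD-level step: pair every element with every element of the same RDD.
  def pairs(rdd: RDD[String]): RDD[(String, String)] = rdd.cartesian(rdd)

  // Stream-level step: apply the RDD-level function to each batch via transform.
  def selfCartesian(stream: DStream[String]): DStream[(String, String)] =
    stream.transform((rdd: RDD[String]) => pairs(rdd))

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("dstream-cartesian-sketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    // queueStream stands in for the custom receiver; each queued RDD becomes one batch.
    val queue = mutable.Queue[RDD[String]](
      ssc.sparkContext.parallelize(Seq("u1", "u2", "u3"))
    )
    val stream = ssc.queueStream(queue)

    // print is an output operation; it triggers the computation for each batch.
    selfCartesian(stream).print()

    ssc.start()
    ssc.awaitTerminationOrTimeout(3000)
    ssc.stop()
  }
}
```

Because `transform` returns a new DStream, further operations such as `map` (for example, a distance function over each pair) can be chained onto the result before an output operation like `print` or `foreachRDD`.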

There is also a nice talk that mentions the transform function at around 17:00: https://www.youtube.com/watch?v=g171ndOHgJ0
