Sort RDD in Spark before publishing it to Kafka?

Views: 21 Published: 2021/11/12 3:19:48 Tags: scala apache-spark apache-kafka

Problem description

In my code, I first subscribe to a Kafka stream and process each RDD to create instances of my class People. I then want to publish the result set (a Dataset[People]) to a specific Kafka topic. Note that not every incoming message received from Kafka maps to an instance of People. Moreover, the People instances must be sent to Kafka in exactly the same order in which the source messages were received from Kafka.

However, I am not sure whether sorting is really necessary, or whether the People instances keep their original order when the code runs on the executors (in which case I could publish my Dataset to Kafka directly). As far as I understand, sorting is necessary because the code inside foreachRDD can be executed on different nodes in the cluster. Is this correct?

Here is my code:

val myStream = KafkaUtils.createDirectStream[K, V](
  streamingContext,
  PreferConsistent,
  Subscribe[K, V](topics, consumerConfig)
)

def process(record: (RDD[ConsumerRecord[String, String]], Time)): Unit = record match {
  case (rdd, time) if !rdd.isEmpty =>
    // More Code...
    // In the end, I have: Dataset[People]
  case _ => // Empty batch: nothing to do
}

// Do I have to replace this call with map, sort the RDD and then publish it to Kafka?
myStream.foreachRDD((x, y) => process((x, y)))
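
One way to reason about the ordering question: Kafka defines message order only within a single topic partition, by offset. If each People instance carries the partition and offset of the record it came from, there is an explicit key to sort on should a later step (for example a shuffle) reorder the data. The following is a minimal plain-Scala sketch of that idea using ordinary collections, not Spark or Kafka code. `Record`, `parse`, the `person:` prefix, `toPeople` and `restoreOrder` are hypothetical names introduced for illustration, and the `People` case class is only a stand-in for the class in the question.

```scala
// Plain-Scala model of the ordering question; no Spark or Kafka dependency.
object OrderingSketch {
  // Hypothetical stand-in for ConsumerRecord: only the fields that define order.
  final case class Record(partition: Int, offset: Long, value: String)

  // Hypothetical stand-in for the People class from the question.
  final case class People(name: String)

  // Assumed parsing rule: only values of the form "person:<name>" map to People.
  def parse(value: String): Option[People] =
    if (value.startsWith("person:")) Some(People(value.stripPrefix("person:")))
    else None

  // Keep the Kafka coordinates next to each parsed instance and drop
  // records that do not parse. flatMap preserves the input order.
  def toPeople(records: Seq[Record]): Seq[(Int, Long, People)] =
    records.flatMap(r => parse(r.value).map(p => (r.partition, r.offset, p)))

  // Restore per-partition offset order after a step that may have reordered rows.
  def restoreOrder(rows: Seq[(Int, Long, People)]): Seq[(Int, Long, People)] =
    rows.sortBy { case (partition, offset, _) => (partition, offset) }
}
```

In this model, dropping the messages that do not map to People leaves the remaining instances in their original relative order, so a sort is only needed after a step that can reorder rows. Whether a given Spark pipeline contains such a step depends on the transformations hidden behind "More Code..." in the question, which this sketch does not cover.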
