Spark Structured streaming: multiple sinks

Question

1. We are consuming from Kafka using Structured Streaming and writing the processed data set to S3. We also want to write the processed data to Kafka going forward. Is it possible to do this from the same streaming query? (Spark version 2.1.1)

2. In the logs, I see the streaming query progress output, and I have a sample durations JSON from the log. Can someone please clarify the difference between addBatch and getBatch? Also, triggerExecution: is it the time taken to both process the fetched data and write to the sink?

"durationMs" : {
    "addBatch" : 2263426,
    "getBatch" : 12,
    "getOffset" : 273,
   "queryPlanning" : 13,
    "triggerExecution" : 2264288,
    "walCommit" : 552
},

Answer

1. Yes. In Spark 2.1.1, you can use writeStream.foreach to write your data into Kafka. There is an example in this blog: https://databricks.com/blog/2017/04/04/real-time-end-to-end-integration-with-apache-kafka-in-apache-sparks-structured-streaming.html
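
For example, here is a minimal ForeachWriter sketch (assuming the processed rows carry string key and value columns; the topic name, broker address, and the processedDF variable are illustrative placeholders, not taken from the blog):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}
import org.apache.spark.sql.{ForeachWriter, Row}

// Opens one Kafka producer per partition/epoch, sends each row, then closes.
class KafkaSink(topic: String, servers: String) extends ForeachWriter[Row] {
  var producer: KafkaProducer[String, String] = _

  def open(partitionId: Long, version: Long): Boolean = {
    val props = new Properties()
    props.put("bootstrap.servers", servers)
    props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")
    producer = new KafkaProducer[String, String](props)
    true
  }

  def process(row: Row): Unit =
    producer.send(new ProducerRecord(topic, row.getAs[String]("key"), row.getAs[String]("value")))

  def close(errorOrNull: Throwable): Unit = producer.close()
}

// Hypothetical usage: processedDF is the already-processed streaming DataFrame,
// so the same pipeline can feed both the S3 write and this Kafka writer.
val kafkaQuery = processedDF.writeStream
  .foreach(new KafkaSink("output-topic", "broker1:9092"))
  .outputMode("append")
  .start()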

Alternatively, you can use Spark 2.2.0, which adds a Kafka sink with official support for writing to Kafka.
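
With the built-in sink, the write could look like this (a sketch following the pattern in the Structured Streaming + Kafka integration docs; the broker address, topic, checkpoint path, and processedDF variable are placeholders):

// The Kafka sink expects a value column (and optionally a key column),
// both castable to string or binary.
val kafkaQuery = processedDF
  .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")
  .option("topic", "output-topic")
  .option("checkpointLocation", "/path/to/checkpoint")
  .start()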

2. getBatch measures how long it takes to create a DataFrame from the source. This is usually pretty fast. addBatch measures how long it takes to run the DataFrame in the sink.

triggerExecution measures how long it takes to run one trigger execution, and is usually almost the same as getOffset + getBatch + addBatch.
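
As a rough sanity check on the sample above: getOffset (273) + getBatch (12) + addBatch (2263426) = 2263711 ms, and adding queryPlanning (13) and walCommit (552) brings it to 2264276 ms, within a few milliseconds of the reported triggerExecution of 2264288 ms.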
