Understanding flink savepoints & checkpoints

Question

Consider an Apache Flink streaming application with a pipeline like this:

Kafka-Source -> flatMap 1 -> flatMap 2 -> flatMap 3 -> Kafka-Sink

where every flatMap function is a stateless operator (e.g., the normal .flatMap function of a DataStream).
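
For illustration, a minimal sketch of such a pipeline in Java could look like the following (topic names, the broker address, and the pass-through flatMap are placeholders, and the Kafka connector classes are assumed to come from the universal flink-connector-kafka dependency):

import java.util.Properties;
import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.util.Collector;

public class Pipeline {

    // Stateless pass-through used as a stand-in for flatMap 1/2/3.
    public static class PassThrough implements FlatMapFunction<String, String> {
        @Override
        public void flatMap(String value, Collector<String> out) {
            out.collect(value);
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("group.id", "demo-group");              // placeholder group id

        DataStream<String> source = env.addSource(
                new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), props));

        source.flatMap(new PassThrough())   // flatMap 1 (stateless)
              .flatMap(new PassThrough())   // flatMap 2 (stateless)
              .flatMap(new PassThrough())   // flatMap 3 (stateless)
              .addSink(new FlinkKafkaProducer<>("output-topic", new SimpleStringSchema(), props));

        env.execute("kafka-flatmap-pipeline");
    }
}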

How do checkpoints/savepoints work in case an incoming message is pending at flatMap 3? Will the message be reprocessed after a restart, starting again from flatMap 1, or will it skip ahead to flatMap 3?

I am a bit confused, because the documentation seems to refer to application state as something I can use in stateful operators, but I don't have any stateful operators in my application. Is the "processing progress" saved & restored at all, or will the whole pipeline be reprocessed after a failure/restart?

And is there a difference between a failure (-> Flink restores from a checkpoint) and a manual restart using savepoints, regarding my previous questions?

I tried to find out myself (with checkpointing enabled using EXACTLY_ONCE and the RocksDB backend) by placing a Thread.sleep() in flatMap 3 and then cancelling the job with a savepoint. However, this led to the Flink command-line tool hanging until the sleep was over, and even then flatMap 3 was executed and its output even sent to the sink before the job got cancelled. So it seems I cannot manually force this situation to analyze Flink's behaviour.
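
For reference, checkpointing with EXACTLY_ONCE and the RocksDB backend can be enabled roughly like this (a sketch; the interval and the checkpoint directory are placeholders, and the exact API differs between Flink versions):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Inside main(), before building the pipeline:
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(10_000, CheckpointingMode.EXACTLY_ONCE);                // checkpoint every 10 s
env.setStateBackend(new RocksDBStateBackend("file:///tmp/flink-checkpoints"));  // placeholder directory

The job was then cancelled with a savepoint via the command-line client (e.g. flink cancel -s [savepointDirectory] <jobId>, depending on the Flink version).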

In case the "processing progress" is not saved/covered by the checkpoints/savepoints as described above, how could I make sure that, for every message reaching my pipeline, any given operator (flatMap 1/2/3) is never reprocessed in a restart/failure situation?

Answer

When a checkpoint is taken, every task (parallel instance of an operator) checkpoints its state. In your example, the three flatmap operators are stateless, so there is no state to be checkpointed. The Kafka source is stateful and checkpoints the reading offsets for all partitions.
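
To make the distinction concrete, here is a hypothetical stateful flatMap for comparison: it keeps a counter in Flink's managed (keyed) state, and it is exactly this kind of state that gets written into a checkpoint. The stateless flatMaps in the question have nothing comparable, so there is no operator state to snapshot for them.

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Hypothetical stateful flatMap: the counter lives in Flink managed keyed state
// (so it must be applied after a keyBy()) and is included in checkpoints/savepoints.
public class CountingFlatMap extends RichFlatMapFunction<String, String> {

    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void flatMap(String value, Collector<String> out) throws Exception {
        Long current = count.value();                      // null for the first element of a key
        long updated = (current == null ? 0L : current) + 1;
        count.update(updated);
        out.collect(updated + ": " + value);
    }
}

Because the three flatMaps in the pipeline hold no such state, a checkpoint of this job effectively consists of the Kafka source offsets (plus any transactional state of the sink).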

In case of a failure, the job is recovered and all tasks load their state, which means for the source operator that the reading offsets are reset. Hence, the application will reprocess all events since the last checkpoint.

In order to achieve end-to-end exactly-once, you need a special sink connector that either offers transaction support (e.g., for Kafka) or supports idempotent writes.
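
For the Kafka case, a sketch with the universal FlinkKafkaProducer could look like this (topic, broker, and serialization schema are placeholders, and the available constructors differ between connector versions):

import java.nio.charset.StandardCharsets;
import java.util.Properties;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092");   // placeholder broker
// Kafka aborts transactions that outlive this timeout, so it must be large
// enough to cover the gap between two completed checkpoints.
props.setProperty("transaction.timeout.ms", "900000");

KafkaSerializationSchema<String> schema = (element, timestamp) ->
        new ProducerRecord<>("output-topic", element.getBytes(StandardCharsets.UTF_8));

FlinkKafkaProducer<String> transactionalSink = new FlinkKafkaProducer<>(
        "output-topic",                            // default target topic (placeholder)
        schema,
        props,
        FlinkKafkaProducer.Semantic.EXACTLY_ONCE); // transactional, two-phase-commit sink

With Semantic.EXACTLY_ONCE the producer writes records inside a Kafka transaction that is only committed once the corresponding checkpoint completes, so downstream consumers should read with isolation.level=read_committed to avoid seeing uncommitted data.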
