Understanding flink savepoints & checkpoints
Question
Consider an Apache Flink streaming application with a pipeline like this:
Kafka-Source -> flatMap 1 -> flatMap 2 -> flatMap 3 -> Kafka-Sink
where every flatMap function is a stateless operator (e.g. the normal .flatMap function of a DataStream).
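To make the question concrete, the pipeline can be modeled as a chain of pure functions. This is a minimal plain-Python sketch (not the Flink API; function names and the example transformations are hypothetical) illustrating that each flatMap step depends only on its input record, so there is no operator state for a checkpoint to capture:

```python
# Model of the pipeline: three stateless transformations chained between
# a source and a sink. Each step depends only on the current record.

def flat_map_1(record):
    # stateless: output depends only on the input record
    return [record + 1]

def flat_map_2(record):
    return [record * 2]

def flat_map_3(record):
    return [record - 3]

def run_pipeline(source_records):
    sink = []
    for r in source_records:
        for a in flat_map_1(r):
            for b in flat_map_2(a):
                for c in flat_map_3(b):
                    sink.append(c)
    return sink

print(run_pipeline([1, 2, 3]))  # -> [1, 3, 5]
```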
How do checkpoints/savepoints work in case an incoming message is pending at flatMap 3? Will the message be reprocessed after a restart, beginning from flatMap 1, or will it skip ahead to flatMap 3?
I am a bit confused, because the documentation seems to refer to application state as something used by stateful operators, but I don't have stateful operators in my application. Is the "processing progress" saved and restored at all, or will the whole pipeline be re-processed after a failure/restart?
And is there a difference between a failure (-> Flink restores from a checkpoint) and a manual restart using savepoints, regarding the questions above?
I tried to find out myself (with checkpointing enabled, using EXACTLY_ONCE and the RocksDB backend) by placing a Thread.sleep() in flatMap 3 and then cancelling the job with a savepoint. However, this led to the flink command-line tool hanging until the sleep was over, and even then flatMap 3 was executed and its output sent to the sink before the job got cancelled. So it seems I cannot manually force this situation to analyze Flink's behaviour.
In case the "processing progress" is not saved/covered by checkpoints/savepoints as described above, how could I make sure, for every message reaching my pipeline, that any given operator (flatMap 1/2/3) is never re-processed in a restart/failure situation?
Answer
When a checkpoint is taken, every task (a parallel instance of an operator) checkpoints its state. In your example, the three flatMap operators are stateless, so there is no state to be checkpointed. The Kafka source is stateful and checkpoints the read offsets for all partitions.
In case of a failure, the job is recovered and all tasks load their state; for the source operator this means the read offsets are reset. Hence, the application reprocesses all events since the last checkpoint.
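This recovery behaviour can be sketched in a few lines of plain Python (not Flink code; all names and the checkpoint interval are illustrative). The checkpoint stores only the source offset, because the stateless flatMaps contribute no state; on failure, the offset is reset to the last checkpoint and every later event is replayed through the whole chain, starting again at flatMap 1:

```python
# Simulation of Flink's recovery semantics for a stateless pipeline:
# the checkpoint is just the source offset, and recovery replays all
# events after that offset from the top of the operator chain.

def flat_map_chain(record):
    # stands in for flatMap 1 -> flatMap 2 -> flatMap 3; stateless
    return record * 2

def run(events, checkpoint_every=2, crash_at=None):
    sink = []
    checkpointed_offset = 0
    offset = 0
    while offset < len(events):
        if crash_at is not None and offset == crash_at:
            # failure: restore from the last checkpoint; replayed events
            # go through the entire chain again, they do not "skip ahead"
            offset = checkpointed_offset
            crash_at = None
            continue
        sink.append(flat_map_chain(events[offset]))
        offset += 1
        if offset % checkpoint_every == 0:
            checkpointed_offset = offset  # checkpoint: source offset only
    return sink

# A crash at offset 3 replays offset 2 (already processed) and offset 3,
# so the sink sees a duplicate for offset 2:
print(run([1, 2, 3, 4], crash_at=3))  # -> [2, 4, 6, 6, 8]
```

This is why a naive sink observes duplicates after recovery, even though no operator in the pipeline keeps state of its own.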
In order to achieve end-to-end exactly-once, you need a special sink connector that either offers transaction support (e.g., for Kafka) or supports idempotent writes.
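The idempotent-write option can be illustrated with a small plain-Python sketch (class and keying scheme are hypothetical, not a Flink API): if each record is written under a deterministic key, a replay after recovery overwrites the previous write instead of producing a duplicate:

```python
# Sketch of an idempotent sink: writes are keyed by a deterministic id
# (here the source offset), so replayed records overwrite rather than
# duplicate, making reprocessing after recovery harmless.

class IdempotentSink:
    def __init__(self):
        self.store = {}  # key -> value

    def write(self, key, value):
        # a second write with the same key has no additional effect
        self.store[key] = value

sink = IdempotentSink()
# offsets 0 and 1 are delivered twice, simulating a replay after failure
for offset, value in [(0, "a"), (1, "b"), (0, "a"), (1, "b")]:
    sink.write(offset, value)

print(len(sink.store))  # -> 2, the duplicates collapsed
```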