About Flink's exactly-once feature

Question

I am reading the documentation about Flink's exactly-once feature, and I do not quite understand some of the sentences:

After a successful pre-commit, the commit must be guaranteed to eventually succeed – both our operators and our external system need to make this guarantee. If a commit fails (for example, due to an intermittent network issue), the entire Flink application fails, restarts according to the user’s restart strategy, and there is another commit attempt. This process is critical because if the commit does not eventually succeed, data loss occurs.

This says data loss occurs if the commit does not eventually succeed. I interpret it as follows: the commit could succeed, but for some reason it happens to keep failing on every restart. In this case, Flink can only give up on the data belonging to this commit. So, if data loss is unacceptable, should the application be restarted until the commit succeeds?

As we know, if there’s any failure, Flink restores the state of the application to the latest successful checkpoint. One potential catch is in a rare case when the failure occurs after a successful pre-commit but before notification of that fact (a commit) reaches our operator. In that case, Flink restores our operator to the state that has already been pre-committed but not yet committed.

I do not quite follow here, either. What is this notification, which is not mentioned above? And does the said operator mean the sink operator? Also, as I interpret it, if the commit has succeeded and only the so-called notification fails, would that cause data duplication after restoration to the pre-committed state?

Please correct me if the question itself is not valid. Any help is appreciated.

Recommended answer

Flink's end-to-end exactly-once mechanism is based on a protocol similar to two-phase commit (2PC). The protocol is used to ensure that either all sinks of a program commit their output to an external system or none of them do.
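The all-or-nothing rule can be illustrated with a minimal sketch. This is a toy model, not Flink's API: the `Sink` class, its `can_pre_commit` flag, and the `run_checkpoint` function are hypothetical names made up for illustration, and the abort path is a simplification of what happens when a checkpoint does not complete.

```python
class Sink:
    """Toy sink (hypothetical): buffers records in an open transaction."""

    def __init__(self, name, can_pre_commit=True):
        self.name = name
        self.can_pre_commit = can_pre_commit
        self.pending = []    # records written in the open transaction
        self.committed = []  # records visible in the "external system"

    def write(self, record):
        self.pending.append(record)

    def pre_commit(self):
        # Phase 1: vote. Returning True is a promise that commit() will work.
        return self.can_pre_commit

    def commit(self):
        # Phase 2: make the pre-committed records visible.
        self.committed.extend(self.pending)
        self.pending = []

    def abort(self):
        # Discard the open transaction; nothing becomes visible.
        self.pending = []


def run_checkpoint(sinks):
    """Commit on all sinks only if every sink pre-committed; otherwise abort all."""
    votes = [sink.pre_commit() for sink in sinks]
    if all(votes):
        for sink in sinks:
            sink.commit()
        return True
    for sink in sinks:
        sink.abort()
    return False
```

In this sketch, a single sink that cannot pre-commit prevents every other sink from committing, which is the "none or all" property described above.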

When a sink task says "I am ready to commit" (pre-commit), it guarantees that it is able to perform the commit. The sink task then waits for a commit notification from the coordinator, which is sent only if all sink tasks have agreed that they are ready to commit. The guarantee must also hold if the application fails before the notification is received. In that case, the sink task must be able to recover the open (not-yet-committed) transaction and commit it when the next notification is received. In case of multiple failures, the sink must keep trying until the commit succeeds. However, the transaction must be committed only once, even if one or more failures occur.
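The recovery behavior described above can be sketched as follows. This is an illustrative model under stated assumptions, not Flink code: `ExternalSystem`, `commit_with_restarts`, and the transaction id `"txn-1"` are hypothetical, and the sketch assumes the external system can recognize a transaction id it has already committed, so a repeated commit is a no-op.

```python
class ExternalSystem:
    """Toy transactional store (hypothetical); commit is idempotent per txn id."""

    def __init__(self, failures_before_success=0):
        self.staged = {}            # txn_id -> pre-committed records
        self.visible = []           # committed records
        self.committed_ids = set()  # ids of transactions already committed
        self.failures_left = failures_before_success

    def pre_commit(self, txn_id, records):
        self.staged[txn_id] = list(records)

    def commit(self, txn_id):
        if self.failures_left > 0:
            self.failures_left -= 1
            raise ConnectionError("transient failure")
        if txn_id in self.committed_ids:
            return  # already applied: a repeated commit changes nothing
        self.visible.extend(self.staged.pop(txn_id))
        self.committed_ids.add(txn_id)


def commit_with_restarts(system, checkpointed_txn_id, max_restarts=10):
    """Retry the commit of a recovered transaction, one attempt per 'restart'.

    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_restarts + 1):
        try:
            system.commit(checkpointed_txn_id)
            return attempt
        except ConnectionError:
            continue  # stands in for: job fails, restarts, state is restored
    raise RuntimeError("commit never succeeded; pre-committed data would be lost")
```

Because the transaction id is part of the restored state, a sink that restarts after the commit already went through simply commits again without producing duplicates in this model, and a sink whose commit keeps failing retries with the same pre-committed data rather than discarding it.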

This is what is meant by:

After a successful pre-commit, the commit must be guaranteed to eventually succeed

If a sink task is ultimately unable to commit the data that it pre-committed, that data is lost.
