Flink state backend for TaskManager


Question

I have a Flink v1.2 setup with 1 JobManager and 2 TaskManagers, each in its own VM. On each of these hosts I configured the state backend as filesystem and pointed it to a local location (state.backend.fs.checkpointdir: file:///home/ubuntu/Prototype/flink/flink-checkpoints). I set parallelism to 1, and each TaskManager has 1 slot. I then run an event processing job on the JobManager, which assigns it to a TaskManager. I kill the TaskManager running the job, and after a few unsuccessful attempts on the failed TaskManager, Flink tries to run the job on the remaining TaskManager. At this point it fails again because it cannot find the corresponding checkpoints / state: java.io.FileNotFoundException: /home/ubuntu/Prototype/flink/flink-checkpoints/56c409681baeaf205bc1ba6cbe9f8091/chk-14/46f6e71d-ebfe-4b49-bf35-23c2e7f97923 (No such file or directory)

The folder /home/ubuntu/Prototype/flink/flink-checkpoints/56c409681baeaf205bc1ba6cbe9f8091 exists only on the TaskManager that I killed, not on the other one.

My question is: am I supposed to set the same location for checkpointing / state on all the TaskManagers if I want the above functionality?

Thanks!

Recommended answer

The checkpoint directory you use needs to be shared across all machines that make up your Flink cluster. Typically this would be something like HDFS or S3, but it can be any shared filesystem.
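As a rough illustration of the point above, the flink-conf.yaml on every host (the JobManager and both TaskManagers) could point the filesystem backend at the same shared location, so that whichever TaskManager picks up the job after a failure can read the checkpoint files. This is only a sketch: the HDFS host, port, and path below are placeholders and not values from the question.

```yaml
# flink-conf.yaml, identical on the JobManager and on every TaskManager.
# Assumption: "namenode:40010" and "/flink/checkpoints" are placeholder values
# for a shared HDFS location; substitute the address of your own shared storage.
state.backend: filesystem
state.backend.fs.checkpointdir: hdfs://namenode:40010/flink/checkpoints
```

A file:// URI can presumably still be used if it refers to a shared mount (for example an NFS export) that is available at the same path on every machine. What fails in the setup described in the question is that each VM has its own private copy of that directory.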
