Apache Flink to use S3 for backend state and checkpoints

Problem description

  • I was planning to use S3 to store Flink's checkpoints using the FsStateBackend, but somehow I was getting the following error.

Error

org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 's3'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.

Flink version: I am using Flink 1.10.0.

Recommended answer

I have found the solution to the above issue, so I am listing the required steps here.

  1. We need to add some configuration in the flink-conf.yaml file, as listed below.

state.backend: filesystem
state.checkpoints.dir: s3://s3-bucket/checkpoints/ #"s3://<your-bucket>/<path>"
state.backend.fs.checkpointdir: s3://s3-bucket/checkpoints/ #"s3://<your-bucket>/<path>"

s3.access-key: XXXXXXXXXXXXXXXXXXX #your-access-key
s3.secret-key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx #your-secret-key

s3.endpoint: http://127.0.0.1:9000 #your-endpoint-hostname (I have used MinIO)

  2. After completing the first step, we need to copy the respective JAR files (flink-s3-fs-hadoop-1.10.0.jar and flink-s3-fs-presto-1.10.0.jar) from the opt directory to the plugins directory of your Flink installation. For example:

  • Copy /flink-1.10.0/opt/flink-s3-fs-hadoop-1.10.0.jar to /flink-1.10.0/plugins/s3-fs-hadoop/flink-s3-fs-hadoop-1.10.0.jar (recommended for the StreamingFileSink)
  • Copy /flink-1.10.0/opt/flink-s3-fs-presto-1.10.0.jar to /flink-1.10.0/plugins/s3-fs-presto/flink-s3-fs-presto-1.10.0.jar (recommended for checkpointing)

Add this in the checkpointing code:

env.setStateBackend(new FsStateBackend("s3://s3-bucket/checkpoints/"));
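For context, here is a minimal runnable sketch around that line (Java; the class name, checkpoint interval and job name are illustrative additions, not from the original answer):

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class S3CheckpointJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Take a checkpoint every 60 seconds (interval chosen for illustration)
        env.enableCheckpointing(60_000);

        // Write checkpoint data to the S3 bucket configured in flink-conf.yaml
        env.setStateBackend(new FsStateBackend("s3://s3-bucket/checkpoints/"));

        // ... define sources, transformations and sinks here, then:
        // env.execute("s3-checkpoint-job");
    }
}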

  3. After completing all the above steps, restart Flink if it is already running.

Note:

  • If you are using both plugins (flink-s3-fs-hadoop and flink-s3-fs-presto) in Flink, then use s3p:// specifically for flink-s3-fs-presto and s3a:// for flink-s3-fs-hadoop instead of s3://, as sketched below.
  • For more details, see the Flink documentation on S3 file systems.
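A short sketch of that scheme split, reusing the env from the example above (the output path and the sink's encoder are illustrative assumptions):

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

// Checkpoints go through the Presto plugin via the s3p:// scheme
env.setStateBackend(new FsStateBackend("s3p://s3-bucket/checkpoints/"));

// The StreamingFileSink goes through the Hadoop plugin via the s3a:// scheme
StreamingFileSink<String> sink = StreamingFileSink
        .forRowFormat(new Path("s3a://s3-bucket/output/"), new SimpleStringEncoder<String>("UTF-8"))
        .build();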
