Flink - What is localdir configuration in RocksDB?


Question

I'm new to Flink and I'm somewhat confused about the state backend configuration.

As far as I know, RocksDB saves all of the application's state on the filesystem. I use S3 to store the state, so I configured both state.checkpoints.dir and state.savepoints.dir to point to my S3 bucket. Now I see that there is another option related to RocksDB storage called state.backend.rocksdb.localdir. What is its purpose? (I saw that I can't use S3 for this.) Also, if RocksDB uses the local machine's storage for something, what happens when I run on Kubernetes and my pod suddenly fails? Should I use persistent storage?

Another thing: I'm not sure I understood everything about state correctly. Does the checkpoint save all of my state? For example, if I use an AggregationFunction and the application fails, is the aggregated value for each key restored when the application recovers?

Recommended answer

Each of Flink's state backends keeps its working state somewhere local to each worker, while persisting the checkpoints somewhere durable, such as S3. With the heap-based state backend, the working state is stored as objects on the JVM heap, while with RocksDB the working state is stored as serialized bytes on the local disk (with an in-memory, off-heap cache). For performance reasons you don't want to use S3 (or even network-attached storage) for state.backend.rocksdb.localdir. Use local SSD storage if you can.
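As a rough illustration of that split, a flink-conf.yaml fragment might look like the sketch below. The bucket name and the local mount path are hypothetical placeholders, and the exact key names can vary between Flink versions, so treat this as a sketch to check against the documentation for your release rather than a definitive configuration.

```yaml
# Use RocksDB for working state (kept on local disk, with an off-heap cache).
state.backend: rocksdb

# Durable locations for checkpoints and savepoints.
# "my-flink-bucket" is a hypothetical bucket name.
state.checkpoints.dir: s3://my-flink-bucket/checkpoints
state.savepoints.dir: s3://my-flink-bucket/savepoints

# Local working directory for RocksDB on each worker.
# "/mnt/local-ssd/rocksdb" is a hypothetical path; point it at fast local
# storage, not at S3 or network-attached storage.
state.backend.rocksdb.localdir: /mnt/local-ssd/rocksdb
```

The point of the layout is that only the two S3 paths need to survive a worker failure; the local directory holds working state that is rebuilt from the latest checkpoint on recovery.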

Yes, the aggregated value for each key in an AggregationFunction will be restored, should your application fail. The checkpoints include everything, including state kept by the sources and sinks, windows, timers, ProcessFunctions, RichFunctions, etc.
