About StateTtlConfig


Question

I'm configuring a StateTtlConfig for MapState. I want the objects put into the state to live for, say, 3 hours, and then disappear from the state and be handed to the GC so that some memory is released; I think the checkpoints should shrink as well. I had the following configuration before, and it did not seem to work, because the checkpoints were always growing:

private final StateTtlConfig ttlConfig = StateTtlConfig.newBuilder(org.apache.flink.api.common.time.Time.hours(3)).cleanupFullSnapshot().build();

Then I realized that this configuration only takes effect when reading state from a savepoint, which is not my scenario. I would change my TTL configuration to this one:

private final StateTtlConfig ttlConfig = StateTtlConfig.newBuilder(org.apache.flink.api.common.time.Time.hours(3))
            .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired).build();

This is based on the idea that I want to clean up all state for all keys after a defined time.

My questions are:

  1. Is my current configuration correct?
  2. What is the best way to do this?

Thanks once again. Kind regards!

Recommended answer

I don't know enough about your use case to recommend a specific expiration/cleanup policy, but I can offer a few notes.

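My understanding is that cleanupFullSnapshot() specifies that, in addition to whatever other cleanup is being done, expired entries are dropped whenever a full snapshot is taken. According to the Flink documentation, this reduces the size of the snapshot but does not clean up the local state, and it does not apply to incremental checkpoints in the RocksDB state backend.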

The FsStateBackend uses the incremental cleanup strategy. By default it checks 5 entries during each state access and does no additional cleanup during record processing. If your workload has many more writes than reads, that might not be enough. If the state is never accessed, expired state will persist. Choosing cleanupIncrementally(10, false) will make the cleanup more aggressive, assuming you do have some level of state access going on.
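To make the "N entries per access" behaviour concrete, here is a toy model in plain Java. It is not Flink's implementation: the class IncrementalCleanupSketch and its methods are hypothetical, and ConcurrentHashMap is used only because its iterators tolerate modification while a scan is in progress. It only illustrates why expired entries linger until enough state accesses have happened.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of incremental TTL cleanup. This is NOT Flink's implementation.
public class IncrementalCleanupSketch {
    // key -> timestamp of the last write, in milliseconds
    private final Map<String, Long> lastWrite = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final int cleanupSize;
    // Cursor that survives across accesses, so each access continues the scan.
    private Iterator<Map.Entry<String, Long>> cursor;

    public IncrementalCleanupSketch(long ttlMillis, int cleanupSize) {
        this.ttlMillis = ttlMillis;
        this.cleanupSize = cleanupSize;
        this.cursor = lastWrite.entrySet().iterator();
    }

    // A state access: write the entry, then check up to cleanupSize entries.
    public void put(String key, long now) {
        lastWrite.put(key, now);
        cleanupStep(now);
    }

    public int size() {
        return lastWrite.size();
    }

    private void cleanupStep(long now) {
        for (int i = 0; i < cleanupSize; i++) {
            if (!cursor.hasNext()) {
                // Scan finished: start a new pass over the current entries.
                cursor = lastWrite.entrySet().iterator();
                if (!cursor.hasNext()) {
                    return;
                }
            }
            Map.Entry<String, Long> entry = cursor.next();
            if (now - entry.getValue() >= ttlMillis) {
                cursor.remove(); // expired: drop it from the map
            }
        }
    }

    public static void main(String[] args) {
        long hour = 60 * 60 * 1000L;
        IncrementalCleanupSketch state = new IncrementalCleanupSketch(3 * hour, 5);
        for (int i = 0; i < 100; i++) {
            state.put("k" + i, 0L);
        }
        // Four hours later every entry is expired, but nothing has touched the state.
        System.out.println("expired but untouched: " + state.size());
        // Repeated accesses gradually sweep the expired entries away.
        for (int i = 0; i < 100; i++) {
            state.put("fresh", 4 * hour);
        }
        System.out.println("after 100 accesses: " + state.size());
    }
}
```

In this model all 100 expired entries are still present until the state is accessed again, and only the repeated writes to the "fresh" key eventually sweep them out. A larger cleanupSize shortens that tail, which is the effect the answer attributes to cleanupIncrementally(10, false).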

It's not unusual for checkpoint sizes to grow, or to take longer than you'd expect to reach a plateau. Could it simply be that the keyspace is growing?

https://flink.apache.org/2019/05/19/state-ttl.html is a good resource for learning more about Flink's State TTL mechanism.
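Putting the notes above together, a configuration for the 3-hour MapState case might look like the sketch below. It assumes a heap-based backend such as FsStateBackend, a Flink version that still offers the Time-based builder used in the question, and Flink on the classpath. The class name TtlConfigSketch, the state name "events", and the String/Long types are hypothetical placeholders, not taken from the question.

```java
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.time.Time;

public class TtlConfigSketch {

    // Builds a descriptor whose entries expire 3 hours after creation or last write.
    public static MapStateDescriptor<String, Long> buildDescriptor() {
        StateTtlConfig ttlConfig = StateTtlConfig
                .newBuilder(Time.hours(3))
                // Reset the TTL timer on creation and on every write.
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                // Never hand expired values back to user code.
                .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                // Check 10 entries per state access, none extra per processed record.
                .cleanupIncrementally(10, false)
                .build();

        // "events" and the key/value types are placeholders for illustration.
        MapStateDescriptor<String, Long> descriptor =
                new MapStateDescriptor<>("events", String.class, Long.class);
        descriptor.enableTimeToLive(ttlConfig);
        return descriptor;
    }
}
```

Whether 10 entries per access is enough depends on how often the state is actually touched, so treat the numbers as a starting point to tune against observed checkpoint sizes.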
