About StateTtlConfig

Question

I'm configuring a StateTtlConfig for MapState. I want objects placed in state to live for, say, 3 hours, and then disappear from state so the GC can clean them up and free some memory; I'd expect the checkpoints to shrink as well. I had this configuration before, and it did not seem to work, because the checkpoints kept growing:

private final StateTtlConfig ttlConfig = StateTtlConfig.newBuilder(org.apache.flink.api.common.time.Time.hours(3)).cleanupFullSnapshot().build();

Then I realized that this configuration only works when state is read from a savepoint, which is not my scenario. I changed my TTL configuration to this one:

private final StateTtlConfig ttlConfig = StateTtlConfig.newBuilder(org.apache.flink.api.common.time.Time.hours(3))
            .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired).build();

The idea is that I want to clean up all state for all keys after a defined time.

My questions are:

  1. Am I configuring it correctly now?
  2. What is the best way to do this?

Thanks once again. Kind regards!

Recommended answer

I don't know enough about your use case to recommend a specific expiration/cleanup policy, but I can offer a few notes.

My understanding is that cleanupFullSnapshot() specifies that, in addition to whatever other cleanup is being done, expired entries are dropped whenever a full snapshot is taken. That makes the snapshot smaller, but it does not by itself remove the expired entries from the running job's local state.
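As a rough illustration of snapshot-time cleanup, here is a toy model in plain Java. It is not Flink code: the class name SnapshotTtlSketch and its methods are made up for this example. It assumes, as the Flink documentation describes for this strategy, that expired entries are filtered out of the snapshot while the live state is left alone.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Flink code): TTL entries are filtered only when a snapshot is taken. */
final class SnapshotTtlSketch {
    private final long ttlMillis;
    // key -> time of last write, in milliseconds
    private final Map<String, Long> lastWrite = new HashMap<>();

    SnapshotTtlSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    void put(String key, long nowMillis) {
        lastWrite.put(key, nowMillis);
    }

    /** Copies only unexpired entries; the live map is not modified. */
    Map<String, Long> snapshot(long nowMillis) {
        Map<String, Long> copy = new HashMap<>();
        for (Map.Entry<String, Long> e : lastWrite.entrySet()) {
            if (nowMillis - e.getValue() < ttlMillis) {
                copy.put(e.getKey(), e.getValue());
            }
        }
        return copy;
    }

    int liveSize() {
        return lastWrite.size();
    }

    public static void main(String[] args) {
        long threeHours = 3L * 60 * 60 * 1000;
        SnapshotTtlSketch state = new SnapshotTtlSketch(threeHours);
        state.put("old", 0L);            // expired by the time of the snapshot
        state.put("fresh", threeHours);  // still within its TTL
        long now = threeHours + 1;
        System.out.println("snapshot entries: " + state.snapshot(now).size());
        System.out.println("live entries: " + state.liveSize());
    }
}
```

In this model the snapshot holds one entry while the live map still holds two, which is why a snapshot-only strategy would not be expected to free memory in the running job.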

The FsStateBackend uses the incremental cleanup strategy. By default it checks 5 entries during each state access, and does no additional cleanup during record processing. If your workload is such that there are many more writes than reads, that might not be enough. If no access happens to the state, expired state will persist. Choosing cleanupIncrementally(10, false) will make the cleanup more aggressive, assuming you do have some level of state access going on.
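The access-driven behavior can be sketched with a toy model in plain Java. This is not Flink's implementation: IncrementalTtlSketch, its round-robin scan queue, and its method names are assumptions made for illustration. Each put or lookup checks at most cleanupSize entries, which corresponds roughly to the first argument of cleanupIncrementally, and nothing is cleaned while the state is idle.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Flink code) of access-driven incremental TTL cleanup. */
final class IncrementalTtlSketch {
    private final long ttlMillis;
    private final int cleanupSize;
    // key -> time of last write, in milliseconds
    private final Map<String, Long> lastWrite = new HashMap<>();
    // Round-robin scan order; always holds exactly the keys of lastWrite.
    private final ArrayDeque<String> scanOrder = new ArrayDeque<>();

    IncrementalTtlSketch(long ttlMillis, int cleanupSize) {
        this.ttlMillis = ttlMillis;
        this.cleanupSize = cleanupSize;
    }

    void put(String key, long nowMillis) {
        if (lastWrite.put(key, nowMillis) == null) {
            scanOrder.addLast(key); // new key joins the scan queue
        }
        cleanupStep(nowMillis);
    }

    /** Returns true only for unexpired entries; every call also triggers a cleanup step. */
    boolean contains(String key, long nowMillis) {
        Long written = lastWrite.get(key);
        boolean visible = written != null && nowMillis - written < ttlMillis;
        cleanupStep(nowMillis);
        return visible;
    }

    /** Checks at most cleanupSize entries and removes the expired ones. */
    private void cleanupStep(long nowMillis) {
        int toCheck = Math.min(cleanupSize, scanOrder.size());
        for (int i = 0; i < toCheck; i++) {
            String key = scanOrder.pollFirst();
            if (nowMillis - lastWrite.get(key) >= ttlMillis) {
                lastWrite.remove(key);   // expired: drop it for good
            } else {
                scanOrder.addLast(key);  // still valid: revisit later
            }
        }
    }

    int size() {
        return lastWrite.size();
    }

    public static void main(String[] args) {
        IncrementalTtlSketch state = new IncrementalTtlSketch(100L, 5);
        for (int i = 0; i < 20; i++) {
            state.put("k" + i, 0L);
        }
        // Time passes without any access: all 20 entries are still held.
        System.out.println("before access: " + state.size());
        // One access after expiry checks (and here removes) 5 entries.
        state.contains("k0", 200L);
        System.out.println("after one access: " + state.size());
    }
}
```

With cleanupSize set to 10 instead of 5, the same single access would remove ten expired entries, which is the sense in which the larger setting is more aggressive.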

It's not unusual for checkpoint sizes to grow, or to take longer than you'd expect to reach a plateau. Could it simply be that the keyspace is growing?

https://flink.apache.org/2019/05/19/state-ttl.html is a good resource for learning more about Flink's State TTL mechanism.
