Example of raw vs managed state


Question

I am trying to understand the difference between raw and managed state. From the docs:

Keyed State and Operator State exist in two forms: managed and raw.

Managed State is represented in data structures controlled by the Flink runtime, such as internal hash tables, or RocksDB. Examples are "ValueState", "ListState", etc. Flink’s runtime encodes the states and writes them into the checkpoints.

Raw State is state that operators keep in their own data structures. When checkpointed, they only write a sequence of bytes into the checkpoint. Flink knows nothing about the state’s data structures and sees only the raw bytes.

However, I have not found any example highlighting the difference. Can anyone provide a minimal example to make the difference clear in code?

Recommended answer

Operator state is only used in the Operator API, which is intended for power users and is not as stable as the end-user APIs; that is why we rarely advertise it. As an example, consider AbstractUdfStreamOperator, which represents an operator with a UDF. For checkpointing, the state of the UDF needs to be saved, and on recovery it needs to be restored.

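// Excerpt from AbstractUdfStreamOperator: both hooks delegate to the wrapped user function.

@Override
public void snapshotState(StateSnapshotContext context) throws Exception {
    super.snapshotState(context);
    // Save the state of the user function (UDF) as part of this operator's checkpoint.
    StreamingFunctionUtils.snapshotFunctionState(context, getOperatorStateBackend(), userFunction);
}

@Override
public void initializeState(StateInitializationContext context) throws Exception {
    super.initializeState(context);
    // On recovery, hand the checkpointed state back to the user function.
    StreamingFunctionUtils.restoreFunctionState(context, userFunction);
}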

At this point, the state could be serialized as just a byte blob. As long as the operator can restore the state by itself, the state can take an arbitrary shape.
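
To make the "byte blob" idea concrete, here is a minimal sketch in plain Java that does not use Flink at all. The class RawStateSketch, its methods, and the byte layout are all hypothetical and chosen only for illustration. It shows what raw state amounts to: the operator keeps its own data structure, decides how to encode it, and is the only party able to decode it again. Whoever stores the checkpoint sees nothing but a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an operator; this is NOT a Flink class.
class RawStateSketch {

    // The operator's private data structure. A runtime storing the
    // checkpoint would never learn about this type.
    private final Map<String, Long> counts = new TreeMap<>();

    void process(String word) {
        counts.merge(word, 1L, Long::sum);
    }

    long countOf(String word) {
        return counts.getOrDefault(word, 0L);
    }

    // "Raw" snapshot: the operator picks its own byte layout
    // (entry count, then key/value pairs) and returns opaque bytes.
    byte[] snapshot() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(counts.size());
            for (Map.Entry<String, Long> entry : counts.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeLong(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Restore: only the operator knows how to interpret the blob.
    static RawStateSketch restore(byte[] blob) {
        RawStateSketch op = new RawStateSketch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(blob))) {
            int entries = in.readInt();
            for (int i = 0; i < entries; i++) {
                String key = in.readUTF();
                long value = in.readLong();
                op.counts.put(key, value);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return op;
    }

    public static void main(String[] args) {
        RawStateSketch op = new RawStateSketch();
        op.process("a");
        op.process("b");
        op.process("a");

        byte[] blob = op.snapshot();                  // what a checkpoint would hold
        RawStateSketch recovered = RawStateSketch.restore(blob);

        System.out.println("a=" + recovered.countOf("a") + ", b=" + recovered.countOf("b"));
        // prints "a=2, b=1"
    }
}
```

With managed state such as ValueState or ListState, by contrast, the data structure itself lives inside the runtime, which also handles the encoding when it writes the checkpoint, so the hand-written snapshot and restore methods above would not be needed.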

However, over time many operator states have also been (re-)implemented as managed state, so in reality the line is blurrier.
