Apache Flink: Is MapState automatically updated when I modify a stored object?
Question
Is it necessary to call MapState.put() to update the state manually, or is the state updated automatically when I modify a stored object?
private transient MapState<String, Word> words;
.......
Word w = words.get(word);
if (w == null) {
    w = new Word(word);
    //words.put(word, w); //A
}
if (....) {
    w.countBad(1); // countXXX modifies a private variable in the Word object
} else {
    w.countGood(1);
}
//words.put(word, w); //B
Q: If I use approach A, will the next count calculation automatically update the corresponding MapState entry? Or do I need to use approach B to update the state manually after the calculation is complete?
Recommended answer
From an API point of view, you always need to update the state manually.
However, the actual behavior depends on the state backend. If the application uses the MemoryStateBackend or the FsStateBackend, all local state is stored on the JVM heap of the worker process, i.e., the state backend just holds a reference to the object. Hence, the state is modified directly when you modify the object.
If you use the RocksDBStateBackend, every state access is serialized or deserialized and written to or read from RocksDB. The object you get back is a deserialized copy, so in this case modifying the object has no effect on the state.
I recommend always updating the state explicitly, i.e., calling put() after modifying the object (B in the question), because this ensures that you can switch the state backend without adjusting the logic of your application.
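The difference between the backends can be imitated without Flink. The following is only a minimal sketch using the Java standard library: ToyMapState, HeapToyState and SerializingToyState are hypothetical stand-ins invented for this illustration (they are not Flink classes), and the fields of Word are assumed because the question does not show them. One toy state keeps object references, as a heap-based backend does; the other serializes on put() and deserializes on get(), roughly as a RocksDB-based backend does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class StateBackendDemo {

    /** Stand-in for the Word class from the question; its fields are assumed. */
    static class Word implements Serializable {
        private final String text;
        private int good;
        private int bad;
        Word(String text) { this.text = text; }
        void countGood(int n) { good += n; }
        void countBad(int n) { bad += n; }
        int good() { return good; }
        int bad() { return bad; }
    }

    /** Hypothetical miniature of a map state; NOT Flink's MapState interface. */
    interface ToyMapState {
        Word get(String key);
        void put(String key, Word value);
    }

    /** Heap-style toy: stores references, so callers share the stored object. */
    static class HeapToyState implements ToyMapState {
        private final Map<String, Word> map = new HashMap<>();
        public Word get(String key) { return map.get(key); }
        public void put(String key, Word value) { map.put(key, value); }
    }

    /** RocksDB-style toy: stores bytes, so every get() returns a fresh copy. */
    static class SerializingToyState implements ToyMapState {
        private final Map<String, byte[]> map = new HashMap<>();
        public Word get(String key) {
            byte[] bytes = map.get(key);
            if (bytes == null) return null;
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
                return (Word) in.readObject();
            } catch (IOException | ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
        }
        public void put(String key, Word value) {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
                out.writeObject(value);
            } catch (IOException e) {
                throw new IllegalStateException(e);
            }
            map.put(key, buf.toByteArray());
        }
    }

    /** Approach A from the question: put() only when the entry is first created. */
    static int countWithA(ToyMapState state, String key, int times) {
        for (int i = 0; i < times; i++) {
            Word w = state.get(key);
            if (w == null) {
                w = new Word(key);
                state.put(key, w); // A
            }
            w.countGood(1);
        }
        return state.get(key).good();
    }

    /** Approach B from the question: put() after every modification. */
    static int countWithB(ToyMapState state, String key, int times) {
        for (int i = 0; i < times; i++) {
            Word w = state.get(key);
            if (w == null) {
                w = new Word(key);
            }
            w.countGood(1);
            state.put(key, w); // B
        }
        return state.get(key).good();
    }

    public static void main(String[] args) {
        System.out.println("A, heap-style:        " + countWithA(new HeapToyState(), "flink", 3));
        System.out.println("A, serializing-style: " + countWithA(new SerializingToyState(), "flink", 3));
        System.out.println("B, heap-style:        " + countWithB(new HeapToyState(), "flink", 3));
        System.out.println("B, serializing-style: " + countWithB(new SerializingToyState(), "flink", 3));
    }
}
```

In this toy, approach A counts 3 with the heap-style state but 0 with the serializing one, because the increments are applied to a copy that is never written back. Approach B counts 3 with both, which is why the explicit put() is the portable choice.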