Thread safety of SerializableFunction in Cloud Dataflow


Question

I'm implementing the SerializableFunction interface, and I'd like to reuse some expensive helper objects that I create in the constructor. When this class is used in a Dataflow job, is a new instance created (or cloned) for every thread that uses it?

Thanks, Genady

Recommended answer

Short Answer
A SerializableFunction does not need to be thread-safe, since each thread gets its own deserialized instance. Anything it reaches through a shared scope (for example, static methods or static references) does need to be thread-safe.
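The distinction between per-instance state and shared static state can be sketched as follows. This is only an illustration under stated assumptions: the `SerializableFunction` interface below is a local stand-in so the example compiles without the SDK, and `ExpensiveHelper` and `NormalizeFn` are hypothetical names. Holding the helper in a `transient` field and building it lazily is one common pattern, not necessarily what the asker's constructor-based code does.

```java
import java.io.Serializable;
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Local stand-in for the SDK interface (assumption, so the sketch is self-contained).
interface SerializableFunction<InputT, OutputT> extends Serializable {
    OutputT apply(InputT input);
}

// Hypothetical expensive helper. It reuses a scratch buffer, so it is NOT thread-safe.
class ExpensiveHelper {
    private final StringBuilder scratch = new StringBuilder();

    String normalize(String s) {
        scratch.setLength(0);
        scratch.append(s.trim().toLowerCase(Locale.ROOT));
        return scratch.toString();
    }
}

class NormalizeFn implements SerializableFunction<String, String> {
    private static final long serialVersionUID = 1L;

    // Static: one copy per JVM, visible to every instance on the worker,
    // so it has to be a thread-safe structure.
    private static final ConcurrentMap<String, String> CACHE = new ConcurrentHashMap<>();

    // Per instance: transient so it is not serialized with the function,
    // and rebuilt lazily after deserialization. Only this instance touches it.
    private transient ExpensiveHelper helper;

    @Override
    public String apply(String input) {
        if (helper == null) {
            helper = new ExpensiveHelper();
        }
        return CACHE.computeIfAbsent(input, key -> helper.normalize(key));
    }
}
```

Here the non-thread-safe helper is safe to use because it belongs to a single instance, while the static cache is the part that must tolerate concurrent access.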

Long Answer
The SerializableFunction is serialized using Java's object serialization mechanism and saved as part of the Dataflow specification. Depending on the specification and how it is optimized, the job that uses the SerializableFunction will most likely be broken up into multiple units of work. Each worker machine may then request one or more units of work, which it processes in parallel. Each unit of work uses Java's object serialization mechanism to recreate an instance of the SerializableFunction, and each thread is assigned to only one unit of work. Note that even though each unit of work is assigned to one thread, if the expensive helper objects are not part of the SerializableFunction and are instead accessed some other way, such as through a static reference or method, they may still be shared among multiple instances of the same SerializableFunction on the worker.
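The effect of recreating the function per unit of work can be imitated locally with a plain serialization round trip. The sketch below is not how the Dataflow service itself is implemented; it only shows that each deserialized copy carries its own instance fields while static fields stay shared within the JVM. The `SerializableFunction` interface is a local stand-in, and `CountingFn` and `RoundTrip` are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.atomic.AtomicInteger;

// Local stand-in for the SDK interface (assumption, so the sketch is self-contained).
interface SerializableFunction<InputT, OutputT> extends Serializable {
    OutputT apply(InputT input);
}

class CountingFn implements SerializableFunction<String, Integer> {
    private static final long serialVersionUID = 1L;

    // Static state is not serialized; it exists once per JVM and is shared
    // by every deserialized copy, hence the thread-safe counter.
    static final AtomicInteger SHARED_CALLS = new AtomicInteger();

    // Instance state is copied by serialization; each copy owns its value.
    private int localCalls = 0;

    @Override
    public Integer apply(String input) {
        SHARED_CALLS.incrementAndGet();
        return ++localCalls;
    }
}

class RoundTrip {
    // Serializes an object to bytes and reads it back as a new instance.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}
```

Calling `RoundTrip.copy(fn)` twice yields two distinct `CountingFn` objects whose `localCalls` counters advance independently, whereas `SHARED_CALLS` accumulates calls from both.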
