How to set the unmodifiable collection serializer of Kryo in Spark code
Problem description
I am using Kryo serialization in Spark (v1.6.1) from Java. When serializing a class that has a collection as one of its fields, it throws the following error:
Caused by: java.lang.UnsupportedOperationException
at java.util.Collections$UnmodifiableCollection.add(Collections.java:1055)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:102)
at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
... 27 more
I found out that this happens because Kryo's default CollectionSerializer cannot deserialize the collection: the collection is unmodifiable, so the add call made during deserialization fails. UnmodifiableCollectionsSerializer should be used instead.
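To illustrate the failure mode, here is a minimal sketch that uses only the JDK (no Kryo or Spark). The class name UnmodifiableRefillDemo and the helper canRefill are hypothetical, and the helper is only a simplification of what the stack trace suggests CollectionSerializer.read does, namely adding the elements back to a collection instance one at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class UnmodifiableRefillDemo {

    // Hypothetical helper: tries to add every element to the target collection,
    // roughly the way a generic collection deserializer refills an instance.
    // Returns false if the target rejects add() with UnsupportedOperationException.
    public static boolean canRefill(Collection<String> target, List<String> elements) {
        try {
            for (String element : elements) {
                target.add(element);
            }
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> elements = Arrays.asList("a", "b");

        // A plain ArrayList accepts add(), so refilling works.
        Collection<String> plain = new ArrayList<String>();

        // An unmodifiable wrapper rejects add(), which matches the top frame
        // of the stack trace (Collections$UnmodifiableCollection.add).
        Collection<String> wrapped =
                Collections.unmodifiableCollection(new ArrayList<String>());

        System.out.println("ArrayList refill ok: " + canRefill(plain, elements));
        System.out.println("Unmodifiable refill ok: " + canRefill(wrapped, elements));
    }
}
```

The wrapper rejects every add call, so any serializer that rebuilds a collection by adding elements will fail on it unless a serializer that knows about the wrapper type is registered.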
How do I tell Spark, in code, to use UnmodifiableCollectionsSerializer for Kryo?
My current configuration is:
SparkConf conf = new SparkConf().setAppName("ABC");
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
conf.registerKryoClasses(new Class<?>[] {*list of classes I want to register*});
Recommended answer
In case anybody else faces this issue, here is the solution: I got it working by using the javakaffee kryo-serializers library.
Add the following Maven dependency:
<dependency>
    <groupId>de.javakaffee</groupId>
    <artifactId>kryo-serializers</artifactId>
    <version>0.42</version>
</dependency>
Write a custom Kryo registrator that registers UnmodifiableCollectionsSerializer:
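import com.esotericsoftware.kryo.Kryo;
import de.javakaffee.kryoserializers.UnmodifiableCollectionsSerializer;
import org.apache.spark.serializer.KryoRegistrator;

public class CustomKryoRegistrator implements KryoRegistrator {
    @Override
    public void registerClasses(Kryo kryo) {
        // Registers serializers for the wrappers returned by Collections.unmodifiable*.
        UnmodifiableCollectionsSerializer.registerSerializers(kryo);
    }
}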
Set spark.kryo.registrator to the fully-qualified class name of the custom registrator:
conf.set("spark.kryo.registrator", "com.abc.CustomKryoRegistrator");
Reference:
https://github.com/magro/kryo-serializers