How to set the unmodifiable collection serializer of Kryo in Spark code


Question

I am using Kryo serialization in Spark (v1.6.1) from Java. When a class that has a collection as one of its fields goes through serialization, the following error is thrown:

Caused by: java.lang.UnsupportedOperationException
         at java.util.Collections$UnmodifiableCollection.add(Collections.java:1055)
         at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:102)
         at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
         at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
         at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
         ... 27 more

I found out that this happens because Kryo's default CollectionSerializer cannot deserialize the collection: the collection is unmodifiable, so the add() call that CollectionSerializer.read makes (see the stack trace) is rejected. UnmodifiableCollectionsSerializer should be used instead.
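
The root cause can be reproduced without Spark or Kryo. The following is a minimal sketch using only the JDK; the class name UnmodifiableAddDemo and the helper canRefill are hypothetical and merely stand in for a deserializer that refills a collection by calling add(), which is what the stack trace suggests CollectionSerializer.read does.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

public class UnmodifiableAddDemo {
    // Hypothetical stand-in for a deserializer that refills a collection via add().
    // Returns false when the target rejects the call.
    public static boolean canRefill(Collection<String> target) {
        try {
            target.add("element");
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        List<String> mutable = new ArrayList<>();
        List<String> readOnly = Collections.unmodifiableList(new ArrayList<String>());

        System.out.println("ArrayList accepts add(): " + canRefill(mutable));
        System.out.println("Unmodifiable list accepts add(): " + canRefill(readOnly));
    }
}
```

Any wrapper returned by the Collections.unmodifiable* factory methods behaves this way, so a field holding such a wrapper needs a serializer that does not rely on add().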

How do I tell Spark, in code, to use UnmodifiableCollectionsSerializer for Kryo?

My current configuration is:

SparkConf conf = new SparkConf().setAppName("ABC");
conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
conf.registerKryoClasses(new Class<?>[] {*list of classes I want to register*});

Recommended answer

In case anybody else faces this issue, here is the solution: I got it working by using the javakaffee kryo-serializers library.

Add the following Maven dependency:

<dependency>
        <groupId>de.javakaffee</groupId>
        <artifactId>kryo-serializers</artifactId>
        <version>0.42</version>
</dependency>

Write a custom Kryo registrator that registers UnmodifiableCollectionsSerializer:

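    import com.esotericsoftware.kryo.Kryo;
    import de.javakaffee.kryoserializers.UnmodifiableCollectionsSerializer;
    import org.apache.spark.serializer.KryoRegistrator;

    public class CustomKryoRegistrator implements KryoRegistrator {
        @Override
        public void registerClasses(Kryo kryo) {
            // Registers serializers for the Collections.unmodifiable* wrapper classes.
            UnmodifiableCollectionsSerializer.registerSerializers(kryo);
        }
    }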

Set spark.kryo.registrator to the custom registrator's fully qualified class name:

conf.set("spark.kryo.registrator", "com.abc.CustomKryoRegistrator");
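
To illustrate why a dedicated serializer helps, here is a simplified JDK-only sketch of the general idea: copy the elements out of the read-only view, rebuild a mutable collection on the reading side, then wrap it again. This is an assumption-laden illustration, not the implementation used by kryo-serializers; the class RewrapDemo and the methods snapshot and restore are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class RewrapDemo {
    // Hypothetical "write" step: copy the elements out of the read-only view.
    public static List<String> snapshot(List<String> readOnly) {
        return new ArrayList<>(readOnly);
    }

    // Hypothetical "read" step: fill a mutable list first, then wrap it again,
    // so add() is never called on the unmodifiable wrapper itself.
    public static List<String> restore(List<String> elements) {
        List<String> mutable = new ArrayList<>();
        for (String element : elements) {
            mutable.add(element);
        }
        return Collections.unmodifiableList(mutable);
    }

    public static void main(String[] args) {
        List<String> original = Collections.unmodifiableList(Arrays.asList("a", "b"));
        List<String> copy = restore(snapshot(original));

        // The round-tripped list has the same elements and is still read-only.
        System.out.println("Same elements: " + copy.equals(original));
    }
}
```

The round-tripped list keeps its read-only behavior, which matches what the question expects after deserialization.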

References:

https://github.com/magro/kryo-serializers

Spark Kryo: Register a custom serializer
