What is the difference between Spark Serialization and Java Serialization?

Views: 142 | Posted: 2016/5/22 15:57:35 | Tags: java, serialization, apache-spark

Question

I'm using Spark on YARN, and I have a service that I want to call on the distributed nodes.

When I serialize this service object by hand in a JUnit test using Java serialization, all of the service's inner collections are serialized and deserialized correctly:

  @Test
  public void testSerialization() {  

    try (
        ConfigurableApplicationContext contextBusiness = new ClassPathXmlApplicationContext("spring-context.xml");
        FileOutputStream fileOutputStream = new FileOutputStream("myService.ser");
        ObjectOutputStream objectOutputStream = new ObjectOutputStream(fileOutputStream);
        ) {

      final MyService service = (MyService) contextBusiness.getBean("myServiceImpl");

      objectOutputStream.writeObject(service);
      objectOutputStream.flush();

    } catch (final java.io.IOException e) {
      logger.error(e.getMessage(), e);
    }
  }

  @Test
  public void testDeSerialization() throws ClassNotFoundException {  

    try (
        FileInputStream fileInputStream = new FileInputStream("myService.ser");
        ObjectInputStream objectInputStream = new ObjectInputStream(fileInputStream);
        ) {

      final MyService myService = (MyService) objectInputStream.readObject();

      // HERE: a functional test that proves the service has been fully serialized and deserialized.

    } catch (final java.io.IOException e) {
      logger.error(e.getMessage(), e);
    }
  }

But when I call this service through my Spark launcher, whether or not I broadcast the service object, one of its inner collections (a HashMap) disappears (it is not serialized), as if it were marked "transient" (but it is neither transient nor static):

JavaRDD<InputOjbect> listeInputsRDD = sprkCtx.parallelize(listeInputs, 10);
JavaRDD<OutputObject> listeOutputsRDD = listeInputsRDD.map(new Function<InputOjbect, OutputObject>() {
  private static final long serialVersionUID = 1L;

  public OutputObject call(InputOjbect input) throws TarificationXmlException { // Exception

    MyOutput output = service.evaluate(input);
    return (new OutputObject(output));
  }
});

The result is the same if I broadcast the service:

final Broadcast<MyService> broadcastedService = sprkCtx.broadcast(service);      
JavaRDD<InputOjbect> listeInputsRDD = sprkCtx.parallelize(listeInputs, 10);
JavaRDD<OutputObject> listeOutputsRDD = listeInputsRDD.map(new Function<InputOjbect, OutputObject>() {
  private static final long serialVersionUID = 1L;

  public OutputObject call(InputOjbect input) throws TarificationXmlException { // Exception

    MyOutput output = broadcastedService.getValue().evaluate(input);
    return (new OutputObject(output));
  }
});
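To make the "as if it were transient" symptom concrete, here is a minimal sketch of what plain Java serialization does with a transient field. `TransientFieldDemo` and `DemoService` are hypothetical stand-ins invented for this illustration, not the real `MyService`, and the round trip goes through a byte array rather than a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class TransientFieldDemo {

  // Hypothetical stand-in for the service (assumption: not the real MyService).
  static class DemoService implements Serializable {
    private static final long serialVersionUID = 1L;

    // Regular instance field: written to and restored from the stream.
    final Map<String, Integer> kept = new HashMap<>();

    // Transient field: skipped when writing, so it is null after reading back.
    transient Map<String, Integer> dropped = new HashMap<>();
  }

  // Serializes a value into memory and reads it back with java.io streams.
  static <T extends Serializable> T roundTrip(T value)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
      @SuppressWarnings("unchecked")
      T copy = (T) in.readObject();
      return copy;
    }
  }

  public static void main(String[] args) throws Exception {
    DemoService original = new DemoService();
    original.kept.put("a", 1);
    original.dropped.put("b", 2);

    DemoService copy = roundTrip(original);
    System.out.println("kept = " + copy.kept);       // prints "kept = {a=1}"
    System.out.println("dropped = " + copy.dropped); // prints "dropped = null"
  }
}
```

In my case the missing HashMap is not declared transient, so this only shows the behavior the symptom resembles, not its cause.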

If I launch the same Spark code in local mode instead of YARN cluster mode, it works perfectly.

So my question is: what is the difference between Spark serialization and Java serialization? (I'm not using Kryo or any custom serializer.)

EDIT: when I try the Kryo serializer (without explicitly registering any classes), I have the same problem.
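For anyone comparing local mode with cluster mode on a similar symptom, one general difference is that in local mode the driver and the tasks share a single JVM, so state that never travels through serialization (static fields, for example) is still visible to the tasks. The sketch below illustrates only that general point with plain Java serialization; `StaticStateDemo`, `DemoService`, and `SHARED` are hypothetical names, clearing the static map is just a stand-in for "a fresh JVM on another node", and this is not a confirmed diagnosis of the problem above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class StaticStateDemo {

  // Hypothetical stand-in for a service (assumption: not the real MyService).
  static class DemoService implements Serializable {
    private static final long serialVersionUID = 1L;

    // Static state belongs to the class inside one JVM; it is never written to the stream.
    static final Map<String, Integer> SHARED = new HashMap<>();

    // Instance state is written to the stream and restored on the other side.
    final Map<String, Integer> own = new HashMap<>();
  }

  static byte[] serialize(Serializable value) throws IOException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
      out.writeObject(value);
    }
    return bytes.toByteArray();
  }

  static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
    try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
      return in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    DemoService service = new DemoService();
    service.own.put("a", 1);
    DemoService.SHARED.put("b", 2);
    byte[] data = serialize(service);

    // Same JVM (comparable to local mode): the static map is still populated.
    DemoService sameJvm = (DemoService) deserialize(data);
    System.out.println(sameJvm.own + " / " + DemoService.SHARED);

    // Simulated fresh JVM (comparable to a remote executor): static state is gone.
    DemoService.SHARED.clear();
    DemoService otherJvm = (DemoService) deserialize(data);
    System.out.println(otherJvm.own + " / " + DemoService.SHARED);
  }
}
```

The instance map survives both deserializations, while the static map is only populated in the first case because it was never part of the serialized bytes.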
