Serializing RDD

Problem description

I have an RDD that I am trying to serialize and then reconstruct by deserializing it. I want to see whether this is possible in Apache Spark.

    static JavaSparkContext sc = new JavaSparkContext(conf);
    static SerializerInstance si = SparkEnv.get().closureSerializer().newInstance();
    static ClassTag<JavaRDD<String>> tag = scala.reflect.ClassTag$.MODULE$.apply(JavaRDD.class);
    ..
    ..
    JavaRDD<String> rdd = sc.textFile(logFile, 4);
    System.out.println("Element 1 " + rdd.first());

    // Serialize the RDD object itself with Spark's closure serializer,
    // then deserialize it back into a new JavaRDD.
    ByteBuffer bb = si.serialize(rdd, tag);
    JavaRDD<String> rdd2 = si.deserialize(bb, Thread.currentThread().getContextClassLoader(), tag);

    System.out.println(rdd2.partitions().size());
    System.out.println("Element 0 " + rdd2.first()); // exception thrown here

I get an exception on the last line when I perform an action on the newly created RDD. The way I am serializing is similar to how it is done internally in Spark.

Exception in thread "main" org.apache.spark.SparkException: RDD transformations and actions can only be invoked by the driver, not inside of other transformations; for example, rdd1.map(x => rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063.
    at org.apache.spark.rdd.RDD.sc(RDD.scala:87)
    at org.apache.spark.rdd.RDD.take(RDD.scala:1177)
    at org.apache.spark.rdd.RDD.first(RDD.scala:1189)
    at org.apache.spark.api.java.JavaRDDLike$class.first(JavaRDDLike.scala:477)
    at org.apache.spark.api.java.JavaRDD.first(JavaRDD.scala:32)
    at SimpleApp.sparkSend(SimpleApp.java:63)
    at SimpleApp.main(SimpleApp.java:91)

The RDD is created and loaded within the same process, so I don't understand how this error happens.

Recommended answer

I'm the author of this warning message.

Spark does not support performing actions and transformations on copies of RDDs that are created via deserialization. RDDs are serializable so that certain methods on them can be invoked in executors, but end users shouldn't try to manually perform RDD serialization.
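
If the goal is simply to persist an RDD and rebuild it later, one supported route is to serialize the RDD's data rather than the RDD object itself, e.g. with saveAsObjectFile and objectFile. A minimal sketch (the output path here is only an illustration):

    // Write the RDD's contents (not the RDD object) to durable storage...
    rdd.saveAsObjectFile("/tmp/rdd-snapshot");  // hypothetical path
    // ...and later rebuild an equivalent RDD from that data on the driver:
    JavaRDD<String> restored = sc.objectFile("/tmp/rdd-snapshot");
    System.out.println("Element 0 " + restored.first());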

When an RDD is serialized, it loses its reference to the SparkContext that created it, preventing jobs from being launched with it (see https://github.com/apache/spark/blob/a38774b8892a85184520078a2187e9ce2a190038/core/src/main/scala/org/apache/spark/rdd/RDD.scala#L79). In earlier versions of Spark, your code would result in a NullPointerException when Spark tried to access the private, null RDD.sc field.
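
This follows ordinary JVM serialization semantics: a field marked transient (as the RDD's SparkContext reference is) is skipped during serialization and comes back as null after a round trip. A tiny self-contained illustration of that general behavior, not Spark code (Holder is a made-up class):

    import java.io.*;

    class Holder implements Serializable {
        // Stands in for RDD's transient SparkContext reference.
        transient Object context = new Object();

        public static void main(String[] args) throws Exception {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            new ObjectOutputStream(bos).writeObject(new Holder());
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()));
            Holder copy = (Holder) in.readObject();
            System.out.println(copy.context);  // prints "null": the reference was dropped
        }
    }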

This error message was worded this way because users were frequently running into confusing NullPointerExceptions when trying to do things like rdd1.map { _ => rdd2.count() }, which caused actions to be invoked on deserialized RDDs on executor machines. I didn't anticipate that anyone would try to manually serialize / deserialize their RDDs on the driver, so I can see how this error message could be slightly misleading.
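
To make that concrete, here is a minimal sketch of the pattern the message guards against, together with the usual fix of running the inner action on the driver first (the data and variable names are illustrative, reusing the sc from the question):

    JavaRDD<Integer> rdd1 = sc.parallelize(java.util.Arrays.asList(1, 2, 3));
    JavaRDD<Integer> rdd2 = sc.parallelize(java.util.Arrays.asList(10, 20));

    // Invalid: the closure captures rdd2, which gets serialized to the executors;
    // calling count() on the deserialized copy raises the SPARK-5063 error:
    // rdd1.map(x -> rdd2.count() * x);

    // Valid: run the action on the driver, then close over the plain value.
    long n = rdd2.count();                        // executed on the driver
    JavaRDD<Long> scaled = rdd1.map(x -> n * x);  // closure captures a long, not an RDD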
