Should I leave the variable as transient?
Question
I have been experimenting with Apache Spark, trying to solve queries such as top-k and skyline.
I have made a wrapper named `SparkContext` that encloses a `SparkConf` and a `JavaSparkContext`. The class implements `Serializable`, but since `SparkConf` and `JavaSparkContext` are not serializable, the class isn't really serializable either.
I have a class named `TopK` that solves the top-k query. The class implements `Serializable`, but it also has a `SparkContext` member variable, which is not serializable (for the reason above). As a result, I get an exception whenever I try to execute a `TopK` method from within a `.reduce()` function on an RDD.
The solution I have found is to make the `SparkContext` field `transient`.
My question is: should I keep the `SparkContext` variable `transient`, or am I making a big mistake?
The `SparkContext` class:
```java
import java.io.Serializable;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.*;

public class SparkContext implements Serializable {

    private final SparkConf sparkConf;           // this is not serializable
    private final JavaSparkContext sparkContext; // neither is this

    protected SparkContext(String appName, String master) {
        this.sparkConf = new SparkConf();
        this.sparkConf.setAppName(appName);
        this.sparkConf.setMaster(master);
        this.sparkContext = new JavaSparkContext(sparkConf);
    }

    protected JavaRDD<String> textFile(String path) {
        return sparkContext.textFile(path);
    }
}
```
The `TopK` class:
```java
public class TopK implements QueryCalculator, Serializable {

    private final transient SparkContext sparkContext;

    // ...
}
```
An example that throws a `Task not serializable` exception. `getBiggestPointByXDimension` will never even be entered, because for it to be executed inside a reduce function, the class enclosing it (`TopK`) must be serializable.
```java
private Point findMedianPoint(JavaRDD<Point> points) {
    Point biggestPointByXDimension = points.reduce((a, b) -> getBiggestPointByXDimension(a, b));
    // ...
}

private Point getBiggestPointByXDimension(Point first, Point second) {
    return first.getX() > second.getX() ? first : second;
}
```
Answer
To your question: should you keep the `SparkContext` variable `transient`?

Yes, that's fine. It only encapsulates the (Java)SparkContext, and the context is not usable on the workers, so marking it `transient` just tells the serializer not to serialize that field.
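A small sketch of what `transient` does during plain Java serialization (the mechanism Spark's closure serializer builds on): the transient field is skipped on write and comes back as `null` after deserialization. The `Wrapper` class below is a hypothetical stand-in, not the question's actual wrapper.

```java
import java.io.*;

public class TransientDemo {

    // hypothetical stand-in for the question's wrapper: `context` is transient,
    // so the serializer skips it (a bare Object is not Serializable anyway)
    static class Wrapper implements Serializable {
        transient Object context = new Object();
        String query = "top-k";
    }

    // serialize to bytes and read back, like Spark shipping a task to a worker
    static Wrapper roundTrip(Wrapper w) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            new ObjectOutputStream(bos).writeObject(w);
            ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bos.toByteArray()));
            return (Wrapper) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Wrapper copy = roundTrip(new Wrapper());
        System.out.println("query = " + copy.query);     // prints: query = top-k
        System.out.println("context = " + copy.context); // prints: context = null
    }
}
```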
You could also leave your own `SparkContext` wrapper non-serializable and mark the field as transient, with the same effect as above. (By the way, given that `SparkContext` is the Scala class name for the Spark context, I'd choose another name to avoid confusion down the road.)
One more thing: as you pointed out, the reason Spark tries to serialize the complete enclosing class is that a method of that class is being used within a closure. Avoid that! Use an anonymous class or a self-contained closure (which will translate into an anonymous class in the end).
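The capture problem described above can be reproduced with plain Java serializable lambdas, no Spark required. In this sketch (all names hypothetical), a lambda that calls an instance method captures the enclosing instance and fails to serialize, while a lambda that references only a static method serializes fine; this is the same mechanism behind Spark's `Task not serializable` error.

```java
import java.io.*;
import java.util.function.BinaryOperator;

public class ClosureDemo {

    // a BinaryOperator that is also Serializable, like Spark's Function2
    interface SerializableOp<T> extends BinaryOperator<T>, Serializable {}

    static class Point implements Serializable {
        final double x;
        Point(double x) { this.x = x; }
    }

    // instance method: a lambda calling it must capture the instance
    Point biggerByX(Point a, Point b) { return a.x > b.x ? a : b; }

    // static method: no instance capture needed
    static Point biggerByXStatic(Point a, Point b) { return a.x > b.x ? a : b; }

    // returns true if Java serialization of o succeeds
    static boolean serializes(Object o) {
        try {
            new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ClosureDemo demo = new ClosureDemo();

        // captures `demo`, and ClosureDemo is not Serializable, so
        // serialization fails, just like TopK inside a Spark closure
        SerializableOp<Point> capturing = (a, b) -> demo.biggerByX(a, b);
        System.out.println("capturing: " + serializes(capturing));          // false

        // self-contained: references only a static method, captures nothing
        SerializableOp<Point> selfContained = (a, b) -> biggerByXStatic(a, b);
        System.out.println("self-contained: " + serializes(selfContained)); // true
    }
}
```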