Can a model be created on Spark batch and used in Spark Streaming?


Question

Can I create a model in Spark batch and use it in Spark Streaming for real-time processing?

I have seen various examples on the Apache Spark site where both training and prediction are built on the same type of processing (linear regression).

Answer

Here is one more solution, which I just implemented.

I created a model in Spark batch. Suppose the final model object is named regmodel:

final LinearRegressionModel regmodel = algorithm.run(JavaRDD.toRDD(parsedData));

and the Spark context is named sc:

JavaSparkContext sc = new JavaSparkContext(sparkConf);

Now, in the same code, I create a streaming context using the same sc:

final JavaStreamingContext jssc = new JavaStreamingContext(sc,new Duration(Integer.parseInt(conf.getWindow().trim())));

and make predictions on the stream as follows:

// dist1 is a DStream of LabeledPoint parsed from the input stream;
// regmodel is declared final, so it can be referenced inside the closure.
JavaPairDStream<Double, Double> predictvalue = dist1.mapToPair(new PairFunction<LabeledPoint, Double, Double>() {
    private static final long serialVersionUID = 1L;

    @Override
    public Tuple2<Double, Double> call(LabeledPoint v1) throws Exception {
        Double p = v1.label();                      // actual label
        Double q = regmodel.predict(v1.features()); // predicted value
        return new Tuple2<Double, Double>(p, q);
    }
});
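The snippets above never start the streaming context, so the prediction DStream would not actually run. A minimal sketch of how the pieces fit together, assuming the hypothetical class name, app name, and 2000 ms batch duration below (the `...` placeholders stand for the batch training and DStream code shown above):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

// Hypothetical driver skeleton: train once in batch, then reuse the
// trained model inside a streaming job that shares the same SparkContext.
public class BatchTrainStreamPredict {
    public static void main(String[] args) throws InterruptedException {
        SparkConf sparkConf = new SparkConf().setAppName("batch-train-stream-predict");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // ... batch phase: build parsedData and train regmodel as shown above ...

        JavaStreamingContext jssc = new JavaStreamingContext(sc, new Duration(2000));

        // ... streaming phase: create dist1 and the predictvalue DStream as shown above ...

        jssc.start();             // transformations only execute after start()
        jssc.awaitTermination();  // keep the driver alive while the job runs
    }
}
```

Note that `start()` and `awaitTermination()` are required: DStream transformations are lazy and nothing is processed until the streaming context is started.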

