Can we update an existing model in spark-ml/spark-mllib?
Question
We are using spark-ml to build a model from existing data. New data arrives daily.
Is there a way to read only the new data and update the existing model, without having to read all the data and retrain every time?
Recommended answer
It depends on the model you're using, but for some models Spark does exactly what you want. Look at StreamingKMeans, StreamingLinearRegressionWithSGD and StreamingLogisticRegressionWithSGD, and more broadly at StreamingLinearAlgorithm, which the two regression classes build on.
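To make the idea of updating a model from new data only more concrete, here is a minimal sketch in plain Python of the kind of incremental update that streaming k-means performs: each cluster center is moved toward the mean of the newly arrived points, weighted by how many points it has already absorbed. This is not Spark code and does not call any Spark API. The function names (`squared_distance`, `update_centers`) and the `decay` parameter are hypothetical names chosen for illustration, and carrying the discounted weight forward is a simplifying assumption of this sketch.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def update_centers(
    centers: List[List[float]],
    weights: List[float],
    batch: List[Sequence[float]],
    decay: float = 1.0,
) -> Tuple[List[List[float]], List[float]]:
    """Update cluster centers using only the new batch (hypothetical helper).

    centers: current cluster centers (the existing model).
    weights: how many points each center has absorbed so far.
    batch:   the newly arrived points; older data is not re-read.
    decay:   1.0 keeps all history, 0.0 forgets it (assumed parameter name).
    """
    k = len(centers)
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k

    # Assign each new point to its nearest current center.
    for point in batch:
        nearest = min(range(k), key=lambda i: squared_distance(centers[i], point))
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += point[d]

    new_centers: List[List[float]] = []
    new_weights: List[float] = []
    for j in range(k):
        old_weight = weights[j] * decay  # discounted history
        total = old_weight + counts[j]
        if total == 0:
            # Nothing old or new to average: keep the center unchanged.
            new_centers.append(list(centers[j]))
        else:
            # Weighted average of the old center and the new points.
            new_centers.append(
                [(centers[j][d] * old_weight + sums[j][d]) / total for d in range(dim)]
            )
        new_weights.append(total)
    return new_centers, new_weights


if __name__ == "__main__":
    # Existing model: two centers, each built from two earlier points.
    model_centers = [[0.0, 0.0], [10.0, 10.0]]
    model_weights = [2.0, 2.0]
    # Today's data only.
    todays_batch = [[1.0, 1.0], [9.0, 9.0], [11.0, 11.0]]
    model_centers, model_weights = update_centers(
        model_centers, model_weights, todays_batch
    )
    print(model_centers, model_weights)
```

Each day, only that day's batch is passed in, and the returned centers and weights become the model for the next day. In Spark itself, the classes named in the answer play this role on a stream of batches. Models without such a streaming variant generally still need retraining on the full data.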