Difference between org.apache.spark.ml.classification and org.apache.spark.mllib.classification

Question

I'm writing a Spark application and would like to use the algorithms in MLlib. In the API docs I found two different classes for the same algorithm.

For example, there is a LogisticRegression in org.apache.spark.ml.classification and also a LogisticRegressionWithSGD in org.apache.spark.mllib.classification.

The only difference I can find is that the one in org.apache.spark.ml inherits from Estimator and can be used in cross-validation. I'm confused about why they are placed in different packages. Does anyone know the reason? Thanks!

Recommended answer

Here is the JIRA ticket.

And from the design doc:

MLlib now covers a basic selection of machine learning algorithms, e.g., logistic regression, decision trees, alternating least squares, and k-means. The current set of APIs contains several design flaws that prevent us moving forward to address practical machine learning pipelines, make MLlib itself a scalable project.

The new set of APIs will live under org.apache.spark.ml, and o.a.s.mllib will be deprecated once we migrate all features to o.a.s.ml.
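
To illustrate the point raised in the question (the org.apache.spark.ml class is an Estimator, so it can be passed to cross-validation), here is a minimal sketch. It assumes Spark 2.0 or later, where SparkSession and org.apache.spark.ml.linalg.Vectors are available; the toy data, the object name, and the parameter values are made up for illustration and are not from the original answer.

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}
import org.apache.spark.sql.SparkSession

// Hypothetical object name; assumes Spark 2.0+ on the classpath.
object MlCrossValidationSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("ml-cross-validation-sketch")
      .getOrCreate()

    // Made-up toy data: 40 rows with alternating labels 0.0 and 1.0.
    val rows = (0 until 40).map { i =>
      val label = (i % 2).toDouble
      (label, Vectors.dense(label + 0.1 * (i % 5), 1.0 - label))
    }
    val training = spark.createDataFrame(rows).toDF("label", "features")

    // The o.a.s.ml LogisticRegression is an Estimator: fit() returns a model.
    val lr = new LogisticRegression().setMaxIter(10)

    // Candidate regularization values to search over (illustrative only).
    val paramGrid = new ParamGridBuilder()
      .addGrid(lr.regParam, Array(0.01, 0.1))
      .build()

    // CrossValidator accepts any Estimator, which is why the ml class fits here.
    val cv = new CrossValidator()
      .setEstimator(lr)
      .setEvaluator(new BinaryClassificationEvaluator())
      .setEstimatorParamMaps(paramGrid)
      .setNumFolds(3)

    val cvModel = cv.fit(training)
    cvModel.transform(training).select("label", "prediction").show(5)

    spark.stop()
  }
}
```

The o.a.s.mllib classes such as LogisticRegressionWithSGD work on RDDs and do not implement Estimator, so they cannot be passed to setEstimator in this way.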
