Difference between org.apache.spark.ml.classification and org.apache.spark.mllib.classification
Question
I'm writing a Spark application and would like to use the algorithms in MLlib. In the API docs I found two different classes for the same algorithm.
For example, there is a LogisticRegression in org.apache.spark.ml.classification and also a LogisticRegressionWithSGD in org.apache.spark.mllib.classification.
The only difference I can find is that the one in org.apache.spark.ml inherits from Estimator and can be used in cross-validation. I am confused about why they are placed in different packages. Does anyone know the reason? Thanks!
Recommended answer
Here is the JIRA ticket.
And from the design document:
MLlib now covers a basic selection of machine learning algorithms, e.g., logistic regression, decision trees, alternating least squares, and k-means. The current set of APIs contains several design flaws that prevent us from moving forward to address practical machine learning pipelines and make MLlib itself a scalable project.
The new set of APIs will live under org.apache.spark.ml, and o.a.s.mllib will be deprecated once we migrate all features to o.a.s.ml.
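To make the point about Estimator more concrete, here is a minimal sketch in plain Python of why a shared estimator interface matters for cross-validation. It does not use Spark and is not Spark's API: the names `Estimator`, `Model`, `ThresholdClassifier`, and `cross_validate` are hypothetical and only illustrate the idea that a generic tuning routine can work with any algorithm that exposes `fit()` and returns something with `predict()`.

```python
from abc import ABC, abstractmethod
from statistics import mean


class Model(ABC):
    """Hypothetical fitted model: turns a feature into a predicted label."""

    @abstractmethod
    def predict(self, x):
        ...


class Estimator(ABC):
    """Hypothetical learner: fit() consumes labeled rows and returns a Model."""

    @abstractmethod
    def fit(self, data):
        ...


class ThresholdModel(Model):
    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, x):
        return 1 if x >= self.threshold else 0


class ThresholdClassifier(Estimator):
    """Toy learner (an assumption for illustration, not logistic regression):
    the threshold is the midpoint between the two class means."""

    def fit(self, data):
        pos = [x for x, y in data if y == 1]
        neg = [x for x, y in data if y == 0]
        return ThresholdModel((mean(pos) + mean(neg)) / 2)


def cross_validate(estimator, data, k):
    """Generic k-fold cross-validation.

    It relies only on estimator.fit() and model.predict(), so any class
    implementing the Estimator interface can be plugged in unchanged.
    """
    scores = []
    for i in range(k):
        test = data[i::k]  # every k-th row, starting at row i
        train = [row for j, row in enumerate(data) if j % k != i]
        model = estimator.fit(train)
        correct = sum(model.predict(x) == y for x, y in test)
        scores.append(correct / len(test))
    return mean(scores)


# Tiny, well-separated dataset of (feature, label) rows.
data = [(0.1, 0), (0.9, 1), (0.2, 0), (0.8, 1), (0.3, 0), (0.7, 1)]
accuracy = cross_validate(ThresholdClassifier(), data, k=3)
print(accuracy)
```

A class that does not share this interface would need its own hand-written evaluation loop, which is consistent with the observation in the question that only the org.apache.spark.ml class can be used in cross-validation.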