What algorithm is used in Spark decision trees (ID3, C4.5, or CART)?


Question

I have a question about decision trees in MLlib. What algorithm does Spark use? Is it ID3, C4.5, or CART?

Recommended answer

Spark MLlib's decision tree is best described as CART, with ID3-style entropy (information gain) available as a split criterion.

ID3 handles only categorical variables, whereas CART can also handle continuous variables. Spark decision trees handle continuous features as well as categorical ones, so Spark is using CART. The Jira ticket mentioned below shows that C4.5 had not been implemented at the time.
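To make the distinction concrete, here is a minimal pure-Python sketch of how a CART-style learner can split a continuous feature: it tries binary thresholds and keeps the one with the lowest weighted impurity. This is a toy illustration, not Spark's implementation. The function names (`gini`, `entropy`, `best_binary_split`) and the sample data are made up for this example. Passing `impurity=entropy` swaps in the criterion associated with ID3.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def best_binary_split(values, labels, impurity=gini):
    """Return (threshold, weighted_impurity) for the best split `value <= threshold`.

    Candidate thresholds are the midpoints between consecutive distinct
    sorted values. Returns (None, inf) if all values are identical.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # equal values cannot be separated by a threshold
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        # Impurity of the two children, weighted by their sizes.
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best[1]:
            best = ((lo + hi) / 2, score)
    return best


# Hypothetical data: one continuous feature and a binary label.
ages = [22, 25, 31, 38, 45, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, score = best_binary_split(ages, bought)
print(threshold, score)
```

ID3 in its original form has no such threshold search: it branches once per category value, which is why it is limited to categorical features.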

This blog post has some information about the different algorithms, and it is where I got the answer from.

You can find a discussion on extending it to C4.5 in this Jira ticket.

More information about the differences between the algorithms is available here.
