How can I specify splits in a decision tree?

Problem description

I am trying to train a decision tree classifier for evaluating baseball players using scikit-learn's provided function. However, I would like to "pre-specify" or "force" some splits ahead of time, based on what I know to be true about the way experts think (these need to be incorporated regardless). For example, I want to force a split based on batting average > .300.

A related question is -- can I "pre-load" a previously trained decision tree model and merely "update" it in a subsequent training? Or does the decision tree classifier need to re-learn all the rules each time I run it? The analogy I'm trying to make here is to transfer learning, but applied to decision trees.

Recommended answer

The way that I pre-specify splits is to create multiple trees. Separate the players into 2 groups, those with avg > 0.3 and those with avg <= 0.3, then create and test a tree on each group. During scoring, a simple if-then-else can send each player to tree1 or tree2.
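
A minimal sketch of this two-tree setup with scikit-learn, assuming a pandas DataFrame called players that has a batting-average column "avg", a few other numeric feature columns, and a target column "label" (all of these column names are hypothetical, not from the original question):

    import numpy as np
    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier

    FEATURES = ["avg", "obp", "slg", "hr"]  # hypothetical feature columns
    TARGET = "label"                        # hypothetical target column

    def fit_split_trees(players: pd.DataFrame, threshold: float = 0.3):
        """Train one tree per group created by the forced split on batting average."""
        high = players[players["avg"] > threshold]
        low = players[players["avg"] <= threshold]
        tree_high = DecisionTreeClassifier(random_state=0).fit(high[FEATURES], high[TARGET])
        tree_low = DecisionTreeClassifier(random_state=0).fit(low[FEATURES], low[TARGET])
        return tree_high, tree_low

    def predict_split_trees(tree_high, tree_low, players: pd.DataFrame, threshold: float = 0.3):
        """The if-then-else routing: score each player with the tree for its group."""
        mask = (players["avg"] > threshold).to_numpy()
        preds = np.empty(len(players), dtype=object)
        if mask.any():
            preds[mask] = tree_high.predict(players.loc[mask, FEATURES])
        if (~mask).any():
            preds[~mask] = tree_low.predict(players.loc[~mask, FEATURES])
        return preds

The forced split lives entirely in the surrounding Python code; each DecisionTreeClassifier is then free to learn whatever splits it wants within its own group.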

The advantage of this approach is that your code is very explicit. It is also a good way to test these expert rules: build a single tree without the rule, then build the 2 trees and compare.
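
One rough way to run that comparison, reusing the hypothetical players DataFrame, FEATURES, TARGET and the two helper functions from the sketch above, is to score both setups on the same held-out set:

    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    train, test = train_test_split(players, test_size=0.25, random_state=0)

    # Baseline: a single tree with no forced split.
    single = DecisionTreeClassifier(random_state=0).fit(train[FEATURES], train[TARGET])
    acc_single = accuracy_score(test[TARGET], single.predict(test[FEATURES]))

    # Forced split: one tree per batting-average group, routed at scoring time.
    tree_high, tree_low = fit_split_trees(train)
    acc_forced = accuracy_score(test[TARGET], predict_split_trees(tree_high, tree_low, test))

    print(f"single tree: {acc_single:.3f}   forced split: {acc_forced:.3f}")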

The disadvantage is that if you have many rules, this becomes quite burdensome: many trees and many if-then-else branches to maintain, and possibly only small samples for training each tree. But then, perhaps not all of the expert rules are optimal anyway.
