Why GridSearchCV model results are different than the model I manually tuned?


Question

This is my first question ever here, I hope I am doing this right.

I was working on the Titanic dataset, which is popular on Kaggle; check out this tutorial if you want: A Data Science Framework: To Achieve 99% Accuracy.

Part 5.2 teaches how to grid search and tune hyper-parameters. Let me share the related code with you before I get specific about my question.

This is tuning the model with GridSearchCV:

from sklearn import model_selection, tree

cv_split = model_selection.ShuffleSplit(n_splits = 10, test_size = .3, train_size = .6, random_state = 0)
#cv_split = model_selection.KFold(n_splits=10, shuffle=False, random_state=None)

param_grid = {'criterion': ['gini', 'entropy'],
              'splitter': ['best', 'random'],         #splitting methodology; two supported strategies - default is best
              'max_depth': [2,4,6,8,10,None],         #max depth tree can grow; default is none
              'min_samples_split': [2,5,10,.03,.05],  #minimum subset size BEFORE new split (fraction is % of total); default is 2
              'min_samples_leaf': [1,5,10,.03,.05],   #minimum subset size AFTER new split (fraction is % of total); default is 1
              'max_features': [None, 'auto'],         #max features to consider when performing split; default none or all
              'random_state': [0]}

tune_model = model_selection.GridSearchCV(tree.DecisionTreeClassifier(), param_grid=param_grid, scoring = 'roc_auc', return_train_score = True, cv = cv_split)
tune_model.fit(data1[data1_x_bin], data1[Target])

    tune_model.best_params_

The result is:

    {'criterion': 'gini',
     'max_depth': 4,
     'max_features': None,
     'min_samples_leaf': 5,
     'min_samples_split': 2,
     'random_state': 0,
     'splitter': 'best'}

And according to the code, training and test accuracy are supposed to look like this when tuned with those parameters:

print(tune_model.cv_results_['mean_train_score'][tune_model.best_index_], tune_model.cv_results_['mean_test_score'][tune_model.best_index_])

This outputs: 0.8924916598172832 0.8767742588186237

Out of curiosity, I wanted to make my own DecisionTreeClassifier() with the parameters I got from GridSearchCV:

dtree = tree.DecisionTreeClassifier(criterion = 'gini',max_depth = 4,max_features= None, min_samples_leaf= 5, min_samples_split= 2,random_state = 0,  splitter ='best')

results = model_selection.cross_validate(dtree, data1[data1_x_bin],  data1[Target],return_train_score = True, cv  = cv_split)

Same hyperparameters, same cross-validation dataframes, different results. Why?

print(results['train_score'].mean(), results['test_score'].mean())

0.8387640449438202 0.8227611940298509

That one was the tune_model result:

0.8924916598172832 0.8767742588186237

The difference is not even small. Both results should be the same if you ask me.

I don't understand what is different. What is different so that the results are different?

I tried cross validating with KFold instead of ShuffleSplit,

and in both scenarios I tried different random_state values, also random_state = None,

but the results are still different.

Can someone explain the difference, please?

Edit: btw, I also wanted to check the test sample results:

dtree.fit(data1[data1_x_bin],data1[Target])
dtree.score(test1_x_bin,test1_y), tune_model.score(test1_x_bin,test1_y)

Output: (0.8295964125560538, 0.9033059266872216)

Same models (DecisionTreeClassifier), same hyper-parameters, very different results

(obviously they are not the same models, but I can't see how and why)

Answer

Update:

By default, cross_validate uses the estimator's score method to evaluate its performance (you can change that by specifying the scoring keyword argument of cross_validate). The score method of the DecisionTreeClassifier class uses accuracy as its score metric. Within GridSearchCV, roc_auc is used as the score metric. Using the same score metric in both cases results in identical scores. For example, if the scoring of cross_validate is changed to roc_auc, the score difference you observed between the models vanishes:

results = model_selection.cross_validate(dtree, data1[data1_x_bin],  data1[Target], scoring = 'roc_auc' ... )
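A minimal sketch of the complete comparison, assuming the data1, data1_x_bin, Target, dtree, tune_model and cv_split objects defined above (the exact values depend on your data and scikit-learn version):

# Re-run the manual cross-validation with the same roc_auc metric GridSearchCV used,
# so both models are scored the same way.
results_auc = model_selection.cross_validate(dtree, data1[data1_x_bin], data1[Target],
                                             scoring = 'roc_auc',
                                             return_train_score = True,
                                             cv = cv_split)

# Mean train/test ROC AUC of the hand-built tree ...
print(results_auc['train_score'].mean(), results_auc['test_score'].mean())

# ... should now match the GridSearchCV scores of the best candidate.
print(tune_model.cv_results_['mean_train_score'][tune_model.best_index_],
      tune_model.cv_results_['mean_test_score'][tune_model.best_index_])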

About score metrics:

The choice of the score metric determines how the performance of a model is evaluated.

Imagine a model that should predict whether a traffic light is green (traffic light is green -> 1, traffic light is not green -> 0). This model can make two types of mistakes. Either it says the traffic light is green although it is not (a false positive), or it says the traffic light is not green although it is (a false negative). In this case a false negative would be ugly but bearable in its consequences (somebody has to wait longer at the traffic light than necessary). False positives, on the other hand, would be catastrophic (someone runs a red light because it has been classified as green). To evaluate the model's performance, a score metric would be chosen which weighs false positives higher (i.e. classifies them as "worse" errors) than false negatives. Accuracy would be an unsuitable metric here, because false negatives and false positives lower the score to the same extent. More suitable as a score metric would be, for example, precision. This metric weighs false positives with 1 and false negatives with 0 (the number of false negatives has no influence on the precision of a model). For a good overview of what false negatives, false positives, precision, recall, accuracy etc. are, see here. The beta parameter of the F score (another score metric) can be used to set how false positives should be weighted compared to false negatives (for a more detailed explanation, see here). More information about the roc_auc score can be found here (it is calculated from different statistics of the confusion matrix).
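The metric differences can be made concrete with a small toy example (purely illustrative labels, not from the original answer), where a model produces two false positives and no false negatives:

from sklearn.metrics import accuracy_score, precision_score, recall_score, fbeta_score

# Hypothetical labels: 1 = "light is green", 0 = "light is not green"
y_true = [0, 0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1]   # two false positives, no false negatives

print(accuracy_score(y_true, y_pred))         # 0.75 - counts every wrong prediction the same
print(precision_score(y_true, y_pred))        # 0.60 - punishes the false positives much harder
print(recall_score(y_true, y_pred))           # 1.00 - no false negatives, so recall is perfect
print(fbeta_score(y_true, y_pred, beta=0.5))  # ~0.65 - beta < 1 weighs precision more than recall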

In summary, this means that the same model can perform very well with respect to one score metric while performing poorly with respect to another. In the case you described, the decision tree optimized by GridSearchCV and the tree you instantiated afterwards are identical models: both yield identical accuracies and identical roc_auc scores. Which score metric you use to compare the performance of different models on your data set depends on which criteria you consider particularly important for model performance. If the only criterion is how many instances have been classified correctly, accuracy is probably a good choice.
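If you want to see both views at once, cross_validate also accepts a list of scorers, so the identical model can be evaluated under accuracy and roc_auc in a single run (a sketch using the objects from the question; with multiple scorers the result keys are suffixed with the metric name):

# Score the same tree under both metrics in one cross-validation run.
both = model_selection.cross_validate(dtree, data1[data1_x_bin], data1[Target],
                                      scoring = ['accuracy', 'roc_auc'],
                                      return_train_score = True,
                                      cv = cv_split)

print(both['test_accuracy'].mean())   # comparable to results['test_score'] above
print(both['test_roc_auc'].mean())    # comparable to the GridSearchCV mean_test_score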

Old idea (see comments):

You specified a random state for dtree (dtree = tree.DecisionTreeClassifier(random_state = 0, ...)), but none for the decision tree used in the GridSearchCV. Use the same random state there and let me know if that solves the problem.

tune_model = model_selection.GridSearchCV(tree.DecisionTreeClassifier(random_state=0), ...)
