Plot learning curves with caret package and R

Views: 199 | Published: 2020/5/4 9:00:39 | Tags: r, plot, machine-learning, supervised-learning

Problem description

I would like to study the optimal bias/variance tradeoff for model tuning. I'm using caret for R, which lets me plot a performance metric (AUC, accuracy...) against the hyperparameters of the model (mtry, lambda, etc.) and automatically chooses the maximum. This typically returns a good model, but if I want to dig further and choose a different bias/variance tradeoff I need a learning curve, not a performance curve.

For the sake of simplicity, let's say my model is a random forest, which has just one hyperparameter, 'mtry'.

I would like to plot the learning curves of both the training and test sets. Something like this:

(The red curve is the test set.)

On the y axis I put an error metric (number of misclassified examples or something like that); on the x axis, 'mtry' or alternatively the training set size.

Questions:

Does caret have functionality to iteratively train models on training-set folds of different sizes? If I have to code it by hand, how can I do that?
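To make the "code it by hand" variant concrete, here is a minimal sketch of what I have in mind, not a definitive implementation. It holds out a fixed test set, trains a random forest on growing subsets of the remaining rows, and records the misclassification rate on both the subset and the test set. The `iris` data, the 70/30 split, the fraction grid, and the fixed `mtry = 2` are illustrative assumptions of mine; the sketch also assumes the `caret` and `randomForest` packages are installed.

```r
library(caret)  # assumption: caret and randomForest are installed

set.seed(42)
data(iris)

# Hold out a fixed test set; the curve is drawn over subsets of the rest.
in_train   <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_full <- iris[in_train, ]
test_set   <- iris[-in_train, ]

# Illustrative training-set fractions (an assumption, not from the question).
fractions <- c(0.2, 0.4, 0.6, 0.8, 1)

# Misclassification rate of a fitted model on a data set.
error_rate <- function(fit, dat) {
  mean(as.character(predict(fit, dat)) != as.character(dat$Species))
}

curve <- do.call(rbind, lapply(fractions, function(f) {
  # Stratified subsample of the training pool; use every row when f is 1.
  idx <- if (f < 1) {
    createDataPartition(train_full$Species, p = f, list = FALSE)
  } else {
    seq_len(nrow(train_full))
  }
  sub <- train_full[idx, ]

  # method = "none" skips resampling and fits one model at the given mtry.
  fit <- train(Species ~ ., data = sub, method = "rf",
               tuneGrid = data.frame(mtry = 2),
               trControl = trainControl(method = "none"))

  data.frame(n_train     = nrow(sub),
             train_error = error_rate(fit, sub),
             test_error  = error_rate(fit, test_set))
}))

# Training error in blue, test error in red, against training-set size.
plot(curve$n_train, curve$test_error, type = "b", col = "red",
     ylim = range(0, curve$train_error, curve$test_error),
     xlab = "Training set size", ylab = "Misclassification rate")
lines(curve$n_train, curve$train_error, type = "b", col = "blue")
legend("topright", legend = c("training", "test"),
       col = c("blue", "red"), lty = 1)
```

Note that the training error here is a resubstitution error, which for a random forest tends to sit very close to zero, so the training curve may look flatter than in a textbook picture.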

If I want to put the hyperparameter on the x axis, I need all the models trained by caret::train, not just the final model (the one with maximum performance after CV). Are these "discarded" models still available after train?
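For the hyperparameter variant, a rough sketch of the kind of thing I am after is below. As far as I can tell, the `results` element of a `train` object keeps the resampled performance for every candidate value, while only the winning fit is stored in `finalModel`, so the training error for each `mtry` has to come from a refit. The `iris` data, the 5-fold CV, and the `mtry` grid of 1 to 4 are illustrative assumptions; `caret` and `randomForest` are assumed to be installed.

```r
library(caret)  # assumption: caret and randomForest are installed

set.seed(42)
data(iris)

# Tune mtry over an explicit grid with 5-fold cross-validation.
fit <- train(Species ~ ., data = iris, method = "rf",
             tuneGrid = data.frame(mtry = 1:4),
             trControl = trainControl(method = "cv", number = 5))

# fit$results has one row per candidate mtry with the resampled Accuracy.
cv_curve <- data.frame(mtry     = fit$results$mtry,
                       cv_error = 1 - fit$results$Accuracy)

# Refit once per candidate to get the error on the training data itself.
cv_curve$train_error <- sapply(cv_curve$mtry, function(m) {
  refit <- train(Species ~ ., data = iris, method = "rf",
                 tuneGrid = data.frame(mtry = m),
                 trControl = trainControl(method = "none"))
  mean(as.character(predict(refit, iris)) != as.character(iris$Species))
})

# Cross-validated error in red, training error in blue, against mtry.
plot(cv_curve$mtry, cv_curve$cv_error, type = "b", col = "red",
     ylim = range(0, cv_curve$cv_error, cv_curve$train_error),
     xlab = "mtry", ylab = "Misclassification rate")
lines(cv_curve$mtry, cv_curve$train_error, type = "b", col = "blue")
legend("topright", legend = c("training", "cross-validation"),
       col = c("blue", "red"), lty = 1)
```

Here the red curve is the cross-validated error rather than the error on a separate test set; a held-out set could be scored inside the same `sapply` loop if that is preferred.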
