Stratified Train/Test-split in scikit-learn

Views: 73 Published: 2021/6/25 19:38:32 Tags: python scikit-learn

Problem description

I need to split my data into a training set (75%) and test set (25%). I currently do that with the code below:

X, Xt, userInfo, userInfo_train = sklearn.cross_validation.train_test_split(X, userInfo)

However, I'd like to stratify my training dataset. How do I do that? I've been looking into the StratifiedKFold method, but it doesn't let me specify the 75%/25% split and only stratify the training dataset.

Recommended answer

[update for 0.17]

See sklearn.model_selection.train_test_split:

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                    stratify=y, 
                                                    test_size=0.25)

[/update for 0.17]
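
To see what stratify=y does in practice, here is a minimal sketch on synthetic data. The arrays X and y and the 80/20 class imbalance are made up for illustration and are not from the question. It counts the labels in each subset so the class proportions can be compared.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 samples, 80 of class 0 and 20 of class 1.
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 80 + [1] * 20)

# stratify=y asks for the same class proportions in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.25, random_state=0
)

# Count how many samples of each class ended up in each subset.
train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
print("train:", train_counts, "test:", test_counts)
```

Both subsets should keep roughly the 80/20 class ratio of the full label array. Without stratify, the ratio in a small test set can drift noticeably from one random split to the next.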

There is a pull request here. But you can simply do train, test = next(iter(StratifiedKFold(...))) and use the train and test indices if you want.
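
The one-liner above reflects the older API, where StratifiedKFold was constructed from the labels and iterated directly. In sklearn.model_selection the folds come from the split(X, y) method instead. The following is a rough sketch of the same idea under that API. The data is synthetic, and n_splits=4 is an assumption chosen so that one fold is about 25% of the samples.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical data: 100 samples, 80 of class 0 and 20 of class 1.
X = np.arange(200).reshape(100, 2)
y = np.array([0] * 80 + [1] * 20)

# Four folds: each held-out fold is roughly 25% of the data.
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)

# Take only the first (train, test) pair of index arrays.
train_idx, test_idx = next(skf.split(X, y))

X_train, X_test = X[train_idx], X[test_idx]
y_train, y_test = y[train_idx], y[test_idx]
print(len(train_idx), len(test_idx))
```

This only allows test fractions of the form 1/n_splits, which is one reason train_test_split with stratify is usually the more direct choice for a single 75%/25% split.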
