sklearn KFold: access a single fold instead of using a for loop


Question

After calling cross_validation.KFold(n, n_folds=folds), I would like to access the training and testing indexes of a single fold, instead of iterating over all the folds.

Take this example code:

import numpy as np
from sklearn import cross_validation  # pre-0.20 module; removed in sklearn 0.20

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
y = np.array([1, 2, 3, 4])
kf = cross_validation.KFold(4, n_folds=2)

>>> print(kf)
sklearn.cross_validation.KFold(n=4, n_folds=2, shuffle=False,
                               random_state=None)
>>> for train_index, test_index in kf:
...     print("TRAIN:", train_index, "TEST:", test_index)

I would like to access the first fold in kf like this (instead of using a for loop):

train_index, test_index = kf[0]

This should return just the first fold, but instead I get the error: "TypeError: 'KFold' object does not support indexing"

The output I want is:

>>> train_index, test_index = kf[0]
>>> print("TRAIN:", train_index, "TEST:", test_index)
TRAIN: [2 3] TEST: [0 1]

Link: http://scikit-learn.org/stable/modules/generated/sklearn.cross_validation.KFold.html

How do I retrieve the train and test indexes for only a single fold, without going through the whole for loop?

Answer

You are on the right track. All you need to do now is:

kf = cross_validation.KFold(4, n_folds=2)
mylist = list(kf)
train, test = mylist[0]

kf is not a list but an iterable that behaves like a generator: it doesn't compute a train-test split until one is requested. This saves memory, since you are not storing splits you don't need. Calling list() on the KFold object forces it to compute all of the splits at once, after which they can be indexed.

Here are two great SO questions that explain what generators are: one and two
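
To see why indexing fails while next() and list() work, here is a minimal sketch using a plain Python generator. The function fake_folds is a hypothetical stand-in written for illustration, not part of scikit-learn; it only mimics the lazy, one-split-at-a-time behaviour described above and assumes n is divisible by n_folds.

```python
def fake_folds(n, n_folds):
    """Hypothetical stand-in for KFold: lazily yield (train, test) index lists.

    Assumes n is divisible by n_folds, which keeps the sketch short.
    """
    fold_size = n // n_folds
    for start in range(0, n, fold_size):
        test = list(range(start, start + fold_size))
        train = [i for i in range(n) if i not in test]
        yield train, test


folds = fake_folds(4, 2)

# A generator cannot be indexed, which mirrors the error in the question.
try:
    folds[0]
    indexing_worked = True
except TypeError:
    indexing_worked = False

# next() pulls only the first split; the second one is never computed here.
first_train, first_test = next(folds)

# list() materialises every split of a fresh generator, so indexing works.
all_folds = list(fake_folds(4, 2))
second_train, second_test = all_folds[1]

print(indexing_worked)            # False
print(first_train, first_test)    # [2, 3] [0, 1]
print(second_train, second_test)  # [0, 1] [2, 3]
```

The trade-off is the one the answer describes: next() is cheapest when only the first fold is needed, while list() costs memory for every fold but allows access by position.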

Edit, November 2018

The API has changed as of sklearn 0.20: KFold is now imported from sklearn.model_selection, and the splits come from kf.split(X) rather than from iterating over kf itself. An updated example (for py3.6):

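from sklearn.model_selection import KFold
import numpy as np

# Two folds, matching the original example and the output below
kf = KFold(n_splits=2)

X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])

# split() yields index arrays, not rows of X; next() takes only the first fold
train_index, test_index = next(kf.split(X))

In [12]: train_index
Out[12]: array([2, 3])

In [13]: test_index
Out[13]: array([0, 1])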
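
If you need a fold other than the first, one option is itertools.islice, which skips ahead in the split generator without building the full list. This is a sketch that assumes scikit-learn 0.20 or later is installed; nth_fold is a hypothetical helper name chosen for illustration, not a scikit-learn function.

```python
from itertools import islice

import numpy as np
from sklearn.model_selection import KFold


def nth_fold(splitter, X, i):
    """Return the (train_index, test_index) pair of fold i (0-based).

    Hypothetical helper: lazily skips the first i splits and returns the
    next one. Raises StopIteration if i is not a valid fold number.
    """
    return next(islice(splitter.split(X), i, None))


X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
kf = KFold(n_splits=2)

# Fetch the second fold (index 1) directly.
train_index, test_index = nth_fold(kf, X, 1)
print("TRAIN:", train_index, "TEST:", test_index)  # TRAIN: [0 1] TEST: [2 3]

# The index arrays select rows of X.
X_train, X_test = X[train_index], X[test_index]
```

With shuffle=False (the default) the folds are the same on every call, so fetching one fold at a time this way stays consistent with a full loop over kf.split(X).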
