ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1

Views: 21 Published: 2021/12/25 14:45:17 Tags: python scikit-learn cross-validation sklearn-pandas


Problem description

I am trying to train a model using train_test_split and a decision tree regressor:

import sklearn
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

# TODO: Make a copy of the DataFrame, using the 'drop' function to drop the given feature
new_data = samples.drop('Fresh', axis=1)

# TODO: Split the data into training and testing sets using the given feature as the target
X_train, X_test, y_train, y_test = train_test_split(new_data, samples['Fresh'], test_size=0.25, random_state=0)

# TODO: Create a decision tree regressor and fit it to the training set
regressor = DecisionTreeRegressor(random_state=0)
regressor = regressor.fit(X_train, y_train)

# TODO: Report the score of the prediction using the testing set
score = cross_val_score(regressor, X_test, y_test, cv=3)

print(score)

Running this raises the error:

ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1.

If I change the value of cv to 1, I get:

ValueError: k-fold cross-validation requires at least one train/test split by setting n_splits=2 or more, got n_splits=1.

Some sample rows of the data look like this:

   Fresh  Milk  Grocery  Frozen  Detergents_Paper  Delicatessen
0  14755   899     1382    1765                56           749
1   1838  6380     2824    1218              1216           295
2  22096  3575     7041   11422               343          2564

Recommended answer

You get the first error when the number of splits is greater than the number of samples passed to the cross-validator. The corresponding check in the source code is:

if self.n_splits > n_samples:
    raise ValueError(
        ("Cannot have number of splits n_splits={0} greater"
         " than the number of samples: {1}.").format(self.n_splits,
                                                     n_samples))
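The check above can be triggered in isolation. The following is a minimal sketch, assuming a recent scikit-learn where KFold lives in sklearn.model_selection; the one-row array X is made up for illustration and is not the asker's data. The exact wording of the message may differ slightly between versions.

```python
import numpy as np
from sklearn.model_selection import KFold

# Hypothetical input: a single sample with two features.
X = np.array([[1.0, 2.0]])

try:
    # split() is a generator, so the size check only runs once it is iterated.
    list(KFold(n_splits=3).split(X))
    first_error = None
except ValueError as exc:
    first_error = str(exc)

print(first_error)
```

Three folds cannot be built from one sample, so the splitter refuses before any model is fitted.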

You get the second error when the number of folds is less than or equal to 1, which is the case when cv=1. The corresponding check in the source code is shown below (this snippet uses the parameter name n_folds, whereas the message in the question says n_splits):

if n_folds <= 1:
    raise ValueError(
        "k-fold cross validation requires at least one"
        " train / test split by setting n_folds=2 or more,"
        " got n_folds={0}.".format(n_folds))
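The second check can be reproduced the same way. This is a small sketch, again assuming a recent scikit-learn with KFold in sklearn.model_selection, where the validation happens as soon as the splitter is constructed.

```python
from sklearn.model_selection import KFold

try:
    # A single fold would leave no data for either training or testing.
    KFold(n_splits=1)
    second_error = None
except ValueError as exc:
    second_error = str(exc)

print(second_error)
```

Passing cv=1 to cross_val_score ends up building such a splitter, which is why lowering cv to 1 does not work around the first error.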

An educated guess: X_test contains fewer than 3 samples. The error message reports the number of samples as 1, so check the sizes of X_test and of samples carefully.
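One way to check this is to print the split sizes, and then either score the fitted model on the held-out set or cross-validate on a dataset large enough for the requested folds. The sketch below is illustrative only: the arrays tiny_X, X and y are randomly generated stand-ins (hypothetical names and values), not the asker's DataFrame, and the scores on random data carry no meaning.

```python
import numpy as np
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeRegressor

# With only 3 rows, test_size=0.25 leaves a single test sample,
# which would explain "number of samples: 1" in the error message.
tiny_X = np.arange(18).reshape(3, 6)
tiny_train, tiny_test = train_test_split(tiny_X, test_size=0.25, random_state=0)
print(len(tiny_train), len(tiny_test))

# Hypothetical stand-in for a larger dataset: 40 rows, 5 features, 1 target.
rng = np.random.RandomState(0)
X = rng.randint(0, 20000, size=(40, 5))
y = rng.randint(0, 20000, size=40)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
print(len(X_train), len(X_test))

# Option 1: fit on the training set and score once on the held-out set.
regressor = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
test_score = regressor.score(X_test, y_test)

# Option 2: let cross_val_score do the splitting on the whole dataset.
cv_scores = cross_val_score(DecisionTreeRegressor(random_state=0), X, y, cv=3)
print(test_score, cv_scores)
```

Note that cross_val_score clones and refits the estimator on each fold, so passing an already fitted regressor together with the test set does not score that fitted model; regressor.score(X_test, y_test) does.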
