ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1

Views: 613  Published: 2020/10/11 19:53:22  Tags: python, scikit-learn, cross-validation, sklearn-pandas

Problem description

I am trying to train a model using train_test_split and a decision tree regressor:

import sklearn
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import cross_val_score

# TODO: Make a copy of the DataFrame, using the 'drop' function to drop the given feature
new_data = samples.drop('Fresh', axis=1)

# TODO: Split the data into training and testing sets using the given feature as the target
X_train, X_test, y_train, y_test = train_test_split(new_data, samples['Fresh'], test_size=0.25, random_state=0)

# TODO: Create a decision tree regressor and fit it to the training set
regressor = DecisionTreeRegressor(random_state=0)
regressor = regressor.fit(X_train, y_train)

# TODO: Report the score of the prediction using the testing set
score = cross_val_score(regressor, X_test, y_test, cv=3)

print(score)

When I run this, I get the error:

ValueError: Cannot have number of splits n_splits=3 greater than the number of samples: 1.

If I change the value of cv to 1, I get:

ValueError: k-fold cross-validation requires at least one train/test split by setting n_splits=2 or more, got n_splits=1.

Some sample rows of the data look like this:

   Fresh  Milk  Grocery  Frozen  Detergents_Paper  Delicatessen
0  14755   899     1382    1765                56           749
1   1838  6380     2824    1218              1216           295
2  22096  3575     7041   11422               343          2564

Recommended answer

If the number of splits is greater than the number of samples, you get the first error. This is the check in the scikit-learn source code that raises it:

if self.n_splits > n_samples:
    raise ValueError(
        ("Cannot have number of splits n_splits={0} greater"
         " than the number of samples: {1}.").format(self.n_splits,
                                                     n_samples))
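
To see this check fire in isolation, here is a minimal sketch that asks KFold for 3 splits on a single-row array. The array values and the helper name try_split are made up for illustration; the exact wording of the message may differ between scikit-learn versions.

```python
import numpy as np
from sklearn.model_selection import KFold

# One row only, mirroring "number of samples: 1" from the error message.
# The feature values are placeholders for illustration.
X_single = np.array([[899, 1382, 1765, 56, 749]])
X_three = np.arange(15).reshape(3, 5)


def try_split(n_splits, X):
    """Return the number of folds produced, or the error text if splitting fails."""
    try:
        # split() is lazy, so the sample-count check only runs once we iterate.
        return len(list(KFold(n_splits=n_splits).split(X)))
    except ValueError as exc:
        return str(exc)


first_error = try_split(3, X_single)   # too few samples: returns the error text
folds_ok = try_split(3, X_three)       # three samples are enough for three folds

print(first_error)
print(folds_ok)
```

With three rows the same call succeeds, which suggests the problem lies in the size of the data passed in rather than in the cv value itself.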

If the number of folds is less than or equal to 1, you get the second error. In your case, cv=1. This is the corresponding check in the source code:

if n_folds <= 1:
    raise ValueError(
        "k-fold cross validation requires at least one"
        " train / test split by setting n_folds=2 or more,"
        " got n_folds={0}.".format(n_folds))

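The second error can be reproduced without any data at all, because the fold count is validated when the splitter is constructed. A minimal sketch, assuming a recent scikit-learn where the parameter is named n_splits (as in the message quoted in the question) rather than n_folds:

```python
from sklearn.model_selection import KFold

try:
    # A single fold cannot yield both a training part and a test part.
    KFold(n_splits=1)
    second_error = None
except ValueError as exc:
    second_error = str(exc)

print(second_error)
```

So lowering cv to 1 cannot work; the smallest valid value is 2, and the data must contain at least that many samples.
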
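An educated guess: X_test contains fewer than 3 samples. The first error message reports the number of samples as 1, so cross_val_score received a single row. If samples holds only the three rows shown above, test_size=0.25 leaves exactly one row in the test set. Check the shape of X_test carefully.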
