StratifiedShuffleSplit: ValueError: The least populated class in y has only 1 member, which is too few.

Views: 113 | Published: 2020/5/4 10:29:01 | Tags: python-3.x machine-learning scikit-learn

Problem description

I'm using the StratifiedShuffleSplit cross-validator to predict house prices in the Boston dataset. I am running the sample code below:

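from sklearn.model_selection import StratifiedShuffleSplit

def fit_model_S(labels, features, step, clf, parameters):
    # Stratified splitter: 2 splits, 10% of the samples held out for testing
    cv = StratifiedShuffleSplit(n_splits=2, test_size=0.10, random_state=42)
    print(cv)
    # The call to cv.split() is where the ValueError is raised
    for train_index, test_index in cv.split(features, labels):
        labels_train, labels_test = labels[train_index], labels[test_index]
        features_train, features_test = features[train_index], features[test_index]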

I get the error below. The same code works with ShuffleSplit. Does this mean that StratifiedShuffleSplit cannot be used with numeric labels?

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-141-b290147edcbf> in <module>()
     33 dt_steps = [('decision', clf)]
     34 
---> 35 fit_model_S(labels, features,dt_steps,clf,parameters4)  
     36 
     37 

<ipython-input-141-b290147edcbf> in fit_model_S(labels, features, step, clf, parameters)
      8     cv = StratifiedShuffleSplit(n_splits=2,test_size=0.10, random_state = 42)
      9     print (cv)
---> 10     for train_index, test_index in cv.split(features,labels):
     11 
     12         labels_train, labels_test = labels[train_index], labels[test_index]

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\model_selection\_split.py in split(self, X, y, groups)
   1194         """
   1195         X, y, groups = indexable(X, y, groups)
-> 1196         for train, test in self._iter_indices(X, y, groups):
   1197             yield train, test
   1198 

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\model_selection\_split.py in _iter_indices(self, X, y, groups)
   1535         class_counts = np.bincount(y_indices)
   1536         if np.min(class_counts) < 2:
-> 1537             raise ValueError("The least populated class in y has only 1"
   1538                              " member, which is too few. The minimum"
   1539                              " number of groups for any class cannot"

ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
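The traceback shows that the error comes from a per-class count (`np.bincount(y_indices)`) falling below 2. As a rough illustration of why a continuous label can trip that check, the sketch below treats every distinct value as its own class and counts the members. It is not the library's code; the array `y` is an assumption that simply reuses the five MEDV values from the dataset sample in the question.

```python
import numpy as np

# Assumption: the five MEDV values from the dataset sample in the question.
y = np.array([504000.0, 453600.0, 728700.0, 701400.0, 760200.0])

# Treat every distinct value as a class and count its members,
# in the same spirit as the check visible in the traceback.
classes, y_indices = np.unique(y, return_inverse=True)
class_counts = np.bincount(y_indices)

print(len(classes))         # number of distinct "classes"
print(class_counts.min())   # size of the least populated class
```

Because every price here is unique, each "class" has exactly one member, which is the condition the error message complains about.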

A sample of the dataset is shown below.

      RM  LSTAT  PTRATIO      MEDV
0  6.575   4.98     15.3  504000.0
1  6.421   9.14     17.8  453600.0
2  7.185   4.03     17.8  728700.0
3  6.998   2.94     18.7  701400.0
4  7.147   5.33     18.7  760200.0

MEDV is the label in this case.
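Since MEDV is continuous, one possible workaround (not stated in the question, so treat it as an assumption) is to stratify on a binned copy of the label while still training on the original values. The sketch below uses synthetic stand-in data; `features`, `labels`, `label_bins`, and the choice of five quantile bins are all hypothetical and only serve the illustration.

```python
import numpy as np
from sklearn.model_selection import StratifiedShuffleSplit

rng = np.random.default_rng(0)
# Synthetic stand-ins for the real data (assumption): 100 rows, 3 features,
# and a continuous, price-like label.
features = rng.normal(size=(100, 3))
labels = rng.uniform(100_000, 1_000_000, size=100)

# Bin the continuous label into 5 quantile-based groups
# (the number of bins is an arbitrary choice for this sketch).
edges = np.quantile(labels, [0.2, 0.4, 0.6, 0.8])
label_bins = np.digitize(labels, edges)

cv = StratifiedShuffleSplit(n_splits=2, test_size=0.10, random_state=42)
for train_index, test_index in cv.split(features, label_bins):
    # Stratify on the bins, but keep the original continuous labels for modelling.
    labels_train, labels_test = labels[train_index], labels[test_index]
    features_train, features_test = features[train_index], features[test_index]
    print(len(train_index), len(test_index))
```

Each bin needs at least two members for the split to succeed, so the bin count should be small relative to the number of samples. If stratification is not actually required for a regression target, ShuffleSplit, which the question reports as working, avoids the issue altogether.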
