ValueError: Expected n_neighbors <= 1. Got 5 - Scikit K Nearest Classifier

Views: 179 Posted: 2020/5/4 9:45:11 Tags: python numpy pandas machine-learning scikit-learn

Problem description

I'm using scikit-learn's KNN with Levenshtein distance to do some work on strings, much like the example at the bottom of this page: http://scikit-learn.org/stable/faq.html. The difference is that my data is split into training and test sets and is in a DataFrame.

The split is shown here:

train_feature, test_feature, train_class, test_class = train_test_split(features, classes,
                                                    test_size=TEST_SET_SIZE, train_size=TRAINING_SET_SIZE,
                                                    random_state=42)

I have the following:

>>> model = KNeighborsClassifier(metric='pyfunc',func=machine_learning.custom_distance)
>>> model.fit(train_feature['id'], train_class.as_matrix(['gender']))
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='pyfunc',
       metric_params={'func': <function custom_distance at 0x7fd0236267b8>},
       n_neighbors=5, p=2, weights='uniform')

Here train_feature has one column ([24000 rows x 1 columns]), id, and train_class (Name: gender, dtype: object) is a Series of "gender" values, each 'M' or 'F'. The id corresponds to a key in a dict elsewhere.

The custom distance function is:

def custom_distance(x, y):
    i, j = int(x[0]), int(y[0])
    return damerau_levenshtein_distance(lookup_dict[i], lookup_dict[j])

When I try to get the accuracy of the model:

 accuracy = model.score(test_feature, test_class)

I get this error:

 ValueError: Expected n_neighbors <= 1. Got 5

I'm honestly really confused. I've checked the length of each of my datasets and they are fine. Why would it be telling me I only have one data point to draw neighbors from? Any help would be greatly appreciated.
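One possible cause, offered as a sketch and not a confirmed diagnosis: train_feature['id'] is a one-dimensional Series. Older scikit-learn releases promoted 1-D input to a single row, so the model would see one sample with 24000 features, which matches "Expected n_neighbors <= 1". The example below shows the shape difference and fits on a column-shaped array instead. It assumes a recent scikit-learn, where a callable is passed directly as metric. The lookup_dict contents, the labels, and the plain Levenshtein helper (standing in for damerau_levenshtein_distance) are all hypothetical.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-in for the asker's lookup_dict: id -> string.
lookup_dict = {0: "anna", 1: "anne", 2: "hanna", 3: "john", 4: "jon", 5: "joan"}
labels = np.array(["F", "F", "F", "M", "M", "M"])


def levenshtein(a, b):
    """Plain Levenshtein distance (a stand-in for the Damerau variant)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def custom_distance(x, y):
    # x and y are rows of the feature matrix; each row holds a single id.
    i, j = int(x[0]), int(y[0])
    return levenshtein(lookup_dict[i], lookup_dict[j])


ids = np.array(sorted(lookup_dict))   # 1-D, shape (6,)
as_one_sample = np.atleast_2d(ids)    # shape (1, 6): one sample, six features
as_column = ids.reshape(-1, 1)        # shape (6, 1): six samples, one feature

# Fit on the column-shaped array so that each id is its own sample.
model = KNeighborsClassifier(n_neighbors=3, metric=custom_distance,
                             algorithm="brute")
model.fit(as_column, labels)
predictions = model.predict(as_column)
```

With the asker's data, the equivalent change would be to pass the one-column DataFrame (or train_feature[['id']]) to fit, so that training and test inputs are both two-dimensional with a single feature.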
