Unknown label type: 'continuous'

Problem description

Hi team, I'm having an issue.
----------------------

   Avg. Session Length  Time on App  Time on Website  Length of Membership  Yearly Amount Spent
0            34.497268    12.655651        39.577668              4.082621           587.951054
1            31.926272    11.109461        37.268959              2.664034           392.204933
2            33.000915    11.330278        37.110597              4.104543           487.547505
3            34.305557    13.717514        36.721283              3.120179           581.852344
4            33.330673    12.795189        37.536653              4.446308           599.406092
5            33.871038    12.026925        34.476878              5.493507           637.102448
6            32.021596    11.366348        36.683776              4.685017           521.572175

I want to apply KNN:

from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# df is the DataFrame shown above
X = df[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']]
y = df['Yearly Amount Spent']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42)

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X_train, y_train)  # this call raises the ValueError

ValueError: Unknown label type: 'continuous'

Recommended answer

The values in the Yearly Amount Spent column are real numbers, so they cannot serve as labels for a classification problem. The original answer quotes the scikit-learn documentation on this point:

When doing classification in scikit-learn, y is a vector of integers or strings.
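The check behind the error can be reproduced on its own. The sketch below assumes scikit-learn's type_of_target helper from sklearn.utils.multiclass and uses only the seven Yearly Amount Spent values shown in the question, not the full dataset:

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# The seven 'Yearly Amount Spent' values listed in the question
y = np.array([587.951054, 392.204933, 487.547505, 581.852344,
              599.406092, 637.102448, 521.572175])

# Floats with a fractional part are reported as a continuous target,
# which classifiers such as KNeighborsClassifier refuse to fit on.
target_type = type_of_target(y)
print(target_type)
```

For these values the reported type is 'continuous', which is the word that appears in the error message.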

Hence you get the error. If you want to build a classification model, you need to decide how to transform these values into a finite set of labels.
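One possible way to get a finite set of labels is to bin the amounts. This is only a sketch: the choice of three equal-frequency bins via pandas.qcut and the names 'low', 'medium' and 'high' are assumptions made for illustration, and the data is just the seven rows from the question.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Rows copied from the table in the question
df = pd.DataFrame({
    'Avg. Session Length': [34.497268, 31.926272, 33.000915, 34.305557,
                            33.330673, 33.871038, 32.021596],
    'Time on App': [12.655651, 11.109461, 11.330278, 13.717514,
                    12.795189, 12.026925, 11.366348],
    'Time on Website': [39.577668, 37.268959, 37.110597, 36.721283,
                        37.536653, 34.476878, 36.683776],
    'Length of Membership': [4.082621, 2.664034, 4.104543, 3.120179,
                             4.446308, 5.493507, 4.685017],
    'Yearly Amount Spent': [587.951054, 392.204933, 487.547505, 581.852344,
                            599.406092, 637.102448, 521.572175],
})

X = df[['Avg. Session Length', 'Time on App',
        'Time on Website', 'Length of Membership']]
y = df['Yearly Amount Spent']

# Assumption: three equal-frequency bins with illustrative names.
# astype(str) turns the categorical result into plain string labels.
y_binned = pd.qcut(y, q=3, labels=['low', 'medium', 'high']).astype(str)

knn = KNeighborsClassifier(n_neighbors=1)
knn.fit(X, y_binned)          # no error: the labels are now discrete
predictions = knn.predict(X)  # predicting on the training rows, for demonstration only
print(predictions)
```

How many bins to use, and where to put the edges, depends on what the categories are supposed to mean. If the goal is to predict the amount itself rather than a category, that is a regression task, and scikit-learn's KNeighborsRegressor accepts continuous targets.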

Note that if you just want to avoid the error, you could do:

import numpy as np
y = np.asarray(df['Yearly Amount Spent'], dtype="|S6")

This will transform the values in y into strings of the required format. However, nearly every label will then appear in only one sample, so you cannot really build a meaningful model with such a set of labels.
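To see why the string labels are not useful, you can count how often each one occurs. The sketch below applies the same "|S6" conversion to the seven amounts from the question only; on the full dataset a few truncated values might collide, so treat the result as illustrative.

```python
import numpy as np

# The seven 'Yearly Amount Spent' values listed in the question
y = np.array([587.951054, 392.204933, 487.547505, 581.852344,
              599.406092, 637.102448, 521.572175])

# Same conversion as in the answer: each float becomes a 6-byte string
labels = y.astype("|S6")

# Count how many samples share each label
unique_labels, counts = np.unique(labels, return_counts=True)
print(len(unique_labels), "distinct labels for", len(y), "samples")
print("largest class size:", counts.max())
```

Here every sample ends up in a class of its own, so a classifier has nothing to generalize from.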
