ValueError: Shape mismatch: if categories is an array, it has to be of shape (n_features,)


Problem description

I have written a simple piece of code that uses OneHotEncoder.

from sklearn.preprocessing import OneHotEncoder
X = [[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']]
onehotencoder = OneHotEncoder(categories=[0])
X = onehotencoder.fit_transform(X).toarray()

I only want to apply fit_transform to column index 0 of X, that is, to the values [0, 0, 1, 2] that you can see in X. But it raises this error:

ValueError: Shape mismatch: if categories is an array, it has to be of shape (n_features,).

Can anyone solve this problem? I am stuck on it.

Recommended answer

To specify the column index you need to use ColumnTransformer, not the categories parameter.

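The constructor parameter categories explicitly lists the distinct category values, with one list of values per feature (column). For example, for a single column you could provide [[0, 1, 2]] explicitly, whereas 'auto' determines the values from the data. That is why categories=[0] fails here: it has one entry, but X has two features. Also, the column specifier passed to ColumnTransformer can be a slice() object instead of a list of indices.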
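To illustrate what categories actually expects, here is a minimal sketch that encodes only the first column of the question's data. The names first_column, encoder and encoded are made up for this example; the single column is passed on its own as a 2-D list, so n_features is 1 and categories holds exactly one inner list.

```python
from sklearn.preprocessing import OneHotEncoder

# First column of the question's X, kept two-dimensional: 4 samples, 1 feature.
first_column = [[0], [0], [1], [2]]

# categories needs one list of values per feature, so a single
# feature means an outer list containing one inner list.
encoder = OneHotEncoder(categories=[[0, 1, 2]])

# fit_transform returns a sparse matrix by default; toarray() makes it dense.
encoded = encoder.fit_transform(first_column).toarray()

print(encoded)
# [[1. 0. 0.]
#  [1. 0. 0.]
#  [0. 1. 0.]
#  [0. 0. 1.]]
```

With two features, categories would need two inner lists, which is what the "(n_features,)" in the error message refers to.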

from sklearn.preprocessing import OneHotEncoder
from sklearn.compose import ColumnTransformer

X = [[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']]

ct = ColumnTransformer(
    [('one_hot_encoder', OneHotEncoder(categories='auto'), [0])],   # The column numbers to be transformed (here is [0] but can be [0, 1, 3])
    remainder='passthrough'                                         # Leave the rest of the columns untouched
)

X = ct.fit_transform(X)
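
As a variation on the code above, the column specifier can also be written as a slice() object, which may be handy for a contiguous range of columns. This is only a sketch: ct_slice and X_encoded are names chosen for this example, and slice(0, 1) selects the same single column as [0].

```python
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

X = [[0, 'a'], [0, 'b'], [1, 'a'], [2, 'b']]

ct_slice = ColumnTransformer(
    # slice(0, 1) covers column 0 only; slice(0, 2) would cover columns 0 and 1
    [('one_hot_encoder', OneHotEncoder(categories='auto'), slice(0, 1))],
    remainder='passthrough',  # leave the letter column untouched
)

# The one-hot columns for column 0 come first, followed by the passthrough column.
X_encoded = ct_slice.fit_transform(X)
print(X_encoded.shape)
```

The transformed columns are placed first and the passthrough columns after them, so the letters end up in the last column of the result.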
