ValueError while using scikit-learn: number of features of the model doesn't match that of the input

Problem description

I am working on a classification problem using RandomForestClassifier. In the code, I split the dataset into training and test data for making predictions.

The code is as follows:

from sklearn.ensemble import RandomForestClassifier
from sklearn.cross_validation import train_test_split
import numpy as np
from numpy import genfromtxt, savetxt

a = (np.genfromtxt(open('filepath.csv','r'), delimiter=',', dtype='int')[1:])
a_train, a_test = train_test_split(a, test_size=0.33, random_state=0)


def main():
    target = [x[0] for x in a_train]
    train = [x[1:] for x in a_train]

    rf = RandomForestClassifier(n_estimators=100)
    rf.fit(train, target)
    predicted_probs = [[index + 1, x[1]] for index, x in enumerate(rf.predict_proba(a_test))]

    savetxt('filepath.csv', predicted_probs, delimiter=',', fmt='%d,%f', 
            header='Id,PredictedProbability', comments = '')

if __name__=="__main__":
    main()

On execution, however, I get the following error:

ValueError: Number of features of the model must match the input. Model n_features is 1434 and input n_features is 1435

Any suggestions as to how I should proceed? Thanks.

Recommended answer

It looks like you are passing a_test directly to predict_proba, without stripping out the target column.

The model was fitted on 1434 input features, but each row of a_test still carries the target in column 0, so you are feeding it 1434 features plus the target: 1435 values per row.
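The off-by-one can be seen on a toy example. The rows below are hypothetical stand-ins for the CSV data, assuming (as in the question's code) that column 0 holds the target:

```python
# Hypothetical toy rows: column 0 is the target, the remaining columns are features.
rows = [
    [1, 10, 20, 30],
    [0, 11, 21, 31],
]

n_columns = len(rows[0])               # target + features
features = [row[1:] for row in rows]   # drop the target column
n_features = len(features[0])          # what the model is fitted on

print(n_columns, n_features)
```

A model fitted on `features` expects `n_features` values per row, so passing the unsliced `rows` gives it one column too many.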

You can fix this by applying the same slicing to the test rows that you applied to the training rows:

test = [x[1:] for x in a_test]

Then use test in the following line:

predicted_probs = [[index + 1, x[1]] for index, x in enumerate(rf.predict_proba(test))]
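For reference, here is a minimal end-to-end sketch of the corrected flow. It rests on a few assumptions that are not in the original post: the data is synthetic (a small random array replaces the CSV), `split_target` is a hypothetical helper name, and `train_test_split` is imported from `sklearn.model_selection`, where it lives in current scikit-learn releases (the `sklearn.cross_validation` module used in the question was removed in version 0.20).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the CSV: column 0 is a binary target, columns 1.. are features.
rng = np.random.RandomState(0)
labels = (np.arange(60) % 2).reshape(-1, 1)
a = np.hstack([labels, rng.randint(0, 10, size=(60, 5))])
a_train, a_test = train_test_split(a, test_size=0.33, random_state=0)


def split_target(rows):
    """Hypothetical helper: return (target, features) for rows whose first column is the target."""
    return rows[:, 0], rows[:, 1:]


# Apply the same slicing to both splits so the column counts agree.
target, train = split_target(a_train)
_, test = split_target(a_test)

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(train, target)

# predict_proba now receives the same number of columns the model was fitted on.
probs = rf.predict_proba(test)
predicted_probs = [[index + 1, p[1]] for index, p in enumerate(probs)]
```

Routing both splits through a single helper makes it harder for the training and test feature matrices to drift apart again.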
