Number of features of the model must match the input. Model n_features is 40 and input n_features is 38


Problem description

I am getting this error. Please give me any suggestions to resolve it. Here is my code. I am reading the training data from train.csv and the test data from a separate file, test.csv. I am new to machine learning, so I cannot work out what the problem is.

import quandl, math
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import style
import datetime
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.feature_extraction.text import CountVectorizer
from sklearn import metrics
train = pd.read_csv("train.csv", index_col=None)
test = pd.read_csv("test.csv", index_col=None)
vectorizer = CountVectorizer(min_df=1)
X1 = vectorizer.fit_transform(train['question'])
Y1 = vectorizer.fit_transform(test['testing'])
X = X1.toarray()
Y = Y1.toarray()
#print(Y.shape)
number = LabelEncoder()
train['answer'] = number.fit_transform(train['answer'].astype('str'))
features = ['question', 'answer']
y = train['answer']
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X[:25], y)
predicted_result = clf.predict(Y[17])
p_result = number.inverse_transform(predicted_result)
f = open('output.txt', 'w')
t = str(p_result)
f.write(t)
print(p_result)

Recommended answer

There are multiple problems with your code, but the one behind this error is that you call fit_transform on the CountVectorizer (vectorizer) for both the training and the test data. The second call re-fits the vectorizer and learns a new vocabulary from the test text alone, so the test matrix ends up with a different number of columns (38) than the matrix the model was trained on (40).
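To see the mismatch in isolation, here is a minimal sketch. The two short text lists are made-up stand-ins for train['question'] and test['testing'], not the asker's data; only the CountVectorizer calls mirror the question.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data standing in for train['question'] and test['testing'].
train_texts = ["what is your name", "where do you live", "how old are you"]
test_texts = ["what is your age", "where are you from"]

vectorizer = CountVectorizer(min_df=1)

# fit_transform learns a vocabulary from the text it is given.
X_train = vectorizer.fit_transform(train_texts)
train_vocab_size = len(vectorizer.vocabulary_)

# A second fit_transform discards that vocabulary and learns a new one
# from the test text, so the column count (and meaning) changes.
X_test_refit = vectorizer.fit_transform(test_texts)
test_vocab_size = len(vectorizer.vocabulary_)

print("train features:", X_train.shape[1])
print("test features after re-fitting:", X_test_refit.shape[1])
```

Each matrix has one column per word in the vocabulary learned at fit time, so two separate fits generally produce two different widths.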

What you should do is:

X1 = vectorizer.fit_transform(train['question'])

# The following line is changed
Y1 = vectorizer.transform(test['testing'])
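For context, the fix can be sketched end to end as follows. The toy questions and answers are invented for illustration, and random_state is added only to make the run repeatable. The sketch also passes a one-row slice to predict, on the assumption that the asker wants a prediction for a single test question; scikit-learn estimators expect 2D input, so a bare row such as Y[17] would need reshaping.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data standing in for train.csv and test.csv.
train_questions = ["what is your name", "where do you live",
                   "how old are you", "what is your job"]
train_answers = ["alice", "paris", "thirty", "editor"]
test_questions = ["where do you live now", "what is your name please"]

vectorizer = CountVectorizer(min_df=1)
# Learn the vocabulary from the training text only.
X_train = vectorizer.fit_transform(train_questions)
# Reuse that vocabulary for the test text; unseen words are ignored.
X_test = vectorizer.transform(test_questions)

encoder = LabelEncoder()
y_train = encoder.fit_transform(train_answers)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
# X_train and y_train must have the same number of rows.
clf.fit(X_train, y_train)

# Slice with 0:1 to keep a 2D (1, n_features) input for predict.
predicted = clf.predict(X_test[0:1])
labels = encoder.inverse_transform(predicted)
print(labels)
```

Because transform reuses the fitted vocabulary, X_test always has the same number of columns as X_train, which is what the error message asks for.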
