ValueError: feature_names mismatch in the xgboost predict() function
Problem description
I have trained an XGBoost regressor model. When I use the trained model to predict on a new input, the predict() function throws a feature_names mismatch error, even though the input feature vector has the same structure as the training data.
Also, to build the feature vector in the same structure as the training data, I am doing a lot of inefficient processing, such as adding new empty columns (when the data does not exist) and then rearranging the columns so that they match the training structure. Is there a better, cleaner way to format the input so that it matches the training structure?
Recommended answer
This happens when the order of the column names at model-building time differs from the order of the column names at scoring time.
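To check which kind of mismatch is involved before fixing it, the two column lists can be compared directly. The following is a minimal sketch, not part of the original answer: the helper name describe_feature_mismatch and the example column names are hypothetical, and the two lists stand in for the model's stored feature names and the columns of the dataframe being scored.

```python
def describe_feature_mismatch(expected, actual):
    """Summarize how the scoring columns differ from the training columns."""
    expected, actual = list(expected), list(actual)
    # Columns the model was trained on but the new input lacks.
    missing = [c for c in expected if c not in actual]
    # Columns in the new input that the model never saw.
    unexpected = [c for c in actual if c not in expected]
    # Same set of names, but in a different order.
    order_differs = not missing and not unexpected and expected != actual
    return {"missing": missing, "unexpected": unexpected, "order_differs": order_differs}


# Hypothetical column names, for illustration only.
trained = ["age", "income", "city"]
scoring = ["income", "age", "city"]

report = describe_feature_mismatch(trained, scoring)
print(report)
```

If only order_differs is true, reordering the columns is enough; if missing or unexpected is non-empty, the input also needs columns added or dropped.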
I used the following steps to overcome this error.
First, load the pickle file:
model = pickle.load(open("saved_model_file", "rb"))
Next, extract all the column names in the order in which the model used them:
cols_when_model_builds = model.get_booster().feature_names
Then reorder the pandas dataframe to match:
pd_dataframe = pd_dataframe[cols_when_model_builds]
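For the second part of the question (adding missing columns and reordering by hand), one possible shortcut is pandas' DataFrame.reindex, which reorders columns, creates any that are missing, and drops any extras in a single call. This is a sketch under stated assumptions rather than the answer's own method: the helper name align_to_training_columns and the column names are hypothetical, the hard-coded list stands in for model.get_booster().feature_names, and filling missing columns with 0 is only appropriate if that matches how the features were encoded during training (for example, one-hot indicator columns).

```python
import pandas as pd


def align_to_training_columns(df, feature_names, fill_value=0):
    """Return a copy of df with exactly the training columns, in training order.

    Columns absent from df are created and filled with fill_value;
    columns the model never saw are dropped.
    """
    return df.reindex(columns=list(feature_names), fill_value=fill_value)


# Stand-in for model.get_booster().feature_names (hypothetical column names).
cols_when_model_builds = ["age", "income", "city_london", "city_paris"]

# New input: different column order, one extra column, one missing column.
new_input = pd.DataFrame(
    {"income": [52000.0], "city_paris": [1], "age": [41], "session_id": [7]}
)

aligned = align_to_training_columns(new_input, cols_when_model_builds)
print(list(aligned.columns))
```

The aligned dataframe can then be passed to model.predict(). Note that the stored feature names are only available if the model was trained on data that carried column names, such as a pandas dataframe.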