XGBoost difference in train and test features after converting to DMatrix


Problem Description

Just wondering how the following case is possible:

 def fit(self, train, target):
     xgtrain = xgb.DMatrix(train, label=target, missing=np.nan)
     self.model = xgb.train(self.params, xgtrain, self.num_rounds)

I passed the train dataset as a csr_matrix with 5233 columns, and after converting to DMatrix I got 5322 features.

Later, on the predict step, I got an error caused by the bug above :(

 def predict(self, test):
     if not self.model:
         return -1
     xgtest = xgb.DMatrix(test)
     return self.model.predict(xgtest)


Error: ... training data did not have the following fields: f5232

How can I guarantee that my train/test datasets are converted to DMatrix correctly?

Is there any way to do something in Python similar to the following R code?

# get same columns for test/train sparse matrices
col_order <- intersect(colnames(X_train_sparse), colnames(X_test_sparse))
X_train_sparse <- X_train_sparse[,col_order]
X_test_sparse <- X_test_sparse[,col_order]
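
For reference, a rough Python equivalent of the R snippet above (a sketch assuming the data starts out as pandas DataFrames with named columns and a reasonably recent xgboost that accepts DataFrames directly; the frame and label names are illustrative):

import numpy as np
import pandas as pd
import xgboost as xgb

# keep only the columns shared by train and test, in the same order
col_order = X_train_df.columns.intersection(X_test_df.columns)
X_train_df = X_train_df[col_order]
X_test_df = X_test_df[col_order]

xgtrain = xgb.DMatrix(X_train_df, label=y_train, missing=np.nan)
xgtest = xgb.DMatrix(X_test_df, missing=np.nan)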

My approach doesn't work, unfortunately:

def _normalize_columns(self):
    columns = (set(self.xgtest.feature_names) - set(self.xgtrain.feature_names)) | \
          (set(self.xgtrain.feature_names) - set(self.xgtest.feature_names))
    for item in columns:
        if item in self.xgtest.feature_names:
            self.xgtest.feature_names.remove(item)
        else:
            # it seems to be an immutable structure; new items can't be added!!!
            self.xgtest.feature_names.append(item) 
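
The DMatrix feature_names list cannot be edited in place like that. One workaround, assuming the test csr_matrix uses the same column positions as the training matrix and is only missing trailing columns (so the missing columns would be all zeros anyway, as with one-hot encoded levels absent from the test set), is to pad or trim the test matrix to the training width before building the DMatrix. A sketch under that assumption, with illustrative names:

import scipy.sparse as sp
import xgboost as xgb

def make_test_dmatrix(test_csr, n_train_features):
    # pad with empty (all-zero) columns, or drop extra ones, so the test
    # matrix has exactly as many columns as the training matrix
    n_rows, n_cols = test_csr.shape
    if n_cols < n_train_features:
        pad = sp.csr_matrix((n_rows, n_train_features - n_cols))
        test_csr = sp.hstack([test_csr, pad], format="csr")
    elif n_cols > n_train_features:
        test_csr = test_csr[:, :n_train_features]
    return xgb.DMatrix(test_csr)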

Recommended Answer

Another possibility is that a feature level exists exclusively in the training data and not in the test data. This mostly happens after one-hot encoding, which produces a big matrix with one column per level of each categorical feature. In your case it looks like "f5232" is exclusive to either the training or the test data. In either case, model scoring is likely to throw an error (in most implementations of ML packages) because:

  1. If it is exclusive to the training data: the model object will have a reference to this feature in the model formula, so scoring will throw an error saying it cannot find this column.
  2. If it is exclusive to the test data (less likely, since test data is usually smaller than training data): the model object will not have a reference to this feature in the model formula, so scoring will throw an error saying it got this column but the model formula does not have it. This is also less likely because most implementations are aware of this case.

Solutions:

  1. The best "automatic" solution is to keep only those columns that are common to both the training and the test data after one-hot encoding (see the sketch after this list).
  2. For ad-hoc analysis, if you cannot afford to drop a feature level because of its importance, do stratified sampling to make sure all levels of the feature end up in both the training and the test data.
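
A minimal sketch of option 1, assuming the one-hot encoding is done with pandas.get_dummies (the raw frames and the label variable below are illustrative):

import pandas as pd
import xgboost as xgb

train_ohe = pd.get_dummies(train_raw)   # hypothetical raw training frame
test_ohe = pd.get_dummies(test_raw)     # hypothetical raw test frame

# keep only the dummy columns present in both sets, in the same order
common = train_ohe.columns.intersection(test_ohe.columns)
xgtrain = xgb.DMatrix(train_ohe[common], label=target)
xgtest = xgb.DMatrix(test_ohe[common])

For option 2, scikit-learn's train_test_split with its stratify argument is one way to make sure every level of an important categorical feature ends up in both splits.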
