XGBoost script is not outputting binary values properly

Views: 136 Published: 2020/5/4 10:24:14 Tags: python machine-learning xgboost

Problem description

I'm learning to use xgboost, and I have read through the documentation. However, I don't understand why the output of my script falls between 0 and 2. At first I thought it should be either 0 or 1, since it's a binary classification. Then I read that the output is a probability of the class being 0 or 1. Even so, some outputs are above 1.5 (at least in the CSV), which doesn't make sense to me!

I'm unsure whether the problem is in the xgboost parameters or in the CSV creation. I'm not sure the line np.expm1(preds) should use np.expm1, but I don't know what to change it to.

In conclusion, my question is:

Why is the output not 0 or 1, but values like 0.0xxx and 1.xxx?

Here is my script:

import numpy as np
import xgboost as xgb
import pandas as pd

train = pd.read_csv('../dataset/train.csv')
train = train.drop('ID', axis=1)

y = train['TARGET']

train = train.drop('TARGET', axis=1)
x = train

# .values replaces as_matrix(), which newer pandas versions no longer provide
dtrain = xgb.DMatrix(x.values, label=y.tolist())

test = pd.read_csv('../dataset/test.csv')

test = test.drop('ID', axis=1)
dtest = xgb.DMatrix(test.values)


# XGBoost params:
def get_params():
    params = {}
    params["objective"] = "binary:logistic"
    params["booster"] = "gbtree"
    params["eval_metric"] = "auc"
    params["eta"] = 0.3
    params["subsample"] = 0.50
    params["colsample_bytree"] = 1.0
    params["max_depth"] = 20
    params["nthread"] = 4
    plst = list(params.items())
    return plst


bst = xgb.train(get_params(), dtrain, 1000)

preds = bst.predict(dtest)

print(np.max(preds))
print(np.min(preds))
print(np.average(preds))

# Make Submission
test_aux = pd.read_csv('../dataset/test.csv')
result = pd.DataFrame({"Id": test_aux["ID"], 'TARGET': np.expm1(preds)})

result.to_csv("xgboost_submission.csv", index=False)
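
For illustration, here is a minimal sketch of what np.expm1 does to values between 0 and 1, which is the range that the binary:logistic objective predicts in. The preds array below is made up and only stands in for bst.predict(dtest), and the 0.5 cutoff is an assumption, not something taken from the script above. Since expm1(p) = exp(p) - 1, a probability of 1.0 is mapped to about 1.718, which would account for CSV values above 1.5.

```python
import numpy as np

# Hypothetical probabilities standing in for bst.predict(dtest);
# binary:logistic predictions lie in [0, 1].
preds = np.array([0.02, 0.35, 0.5, 0.91, 1.0])

# expm1(p) = exp(p) - 1, so inputs in [0, 1] are stretched to [0, e - 1].
transformed = np.expm1(preds)
print(transformed.max())  # roughly 1.718 for p = 1.0

# If hard 0/1 labels are wanted, threshold the raw probabilities instead.
# The 0.5 cutoff is an assumed value; choose one that suits the task.
labels = (preds > 0.5).astype(int)
print(labels)
```

If the submission is scored on probabilities (the script sets eval_metric to auc), writing preds to the CSV directly, with no np.expm1 and no thresholding, may be all that is needed.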
