f1_score metric in lightgbm

Problem description

I want to train an lgb model with a custom metric: f1_score with weighted average.

I went through the advanced examples of lightgbm and found the implementation of a custom binary error function. I implemented a similar function to return f1_score, as shown below.

from sklearn.metrics import f1_score

def f1_metric(preds, train_data):
    labels = train_data.get_label()
    return 'f1', f1_score(labels, preds, average='weighted'), True

I tried to train the model by passing f1_metric as the feval parameter, as shown below.

evals_results = {}

bst = lgb.train(params,
                dtrain,
                valid_sets=[dvalid],
                valid_names=['valid'],
                evals_result=evals_results,
                num_boost_round=num_boost_round,
                early_stopping_rounds=early_stopping_rounds,
                verbose_eval=25,
                feval=f1_metric)

Then I get ValueError: Found input variables with inconsistent numbers of samples:

The training set is being passed to the function rather than the validation set.

How can I configure this so that the validation set is passed and f1_score is returned?

Recommended answer

The docs are a bit confusing. When describing the signature of the function that you pass to feval, they call its parameters preds and train_data, which is misleading: the second argument is whichever dataset is currently being evaluated, not necessarily the training set.

But the following seems to work:

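import lightgbm as lgb
import numpy as np
from sklearn.metrics import f1_score

def lgb_f1_score(y_hat, data):
    # `data` is the Dataset being evaluated (validation or training set)
    y_true = data.get_label()
    # scikit-learn's f1_score expects class labels, not probabilities
    y_hat = np.round(y_hat)
    # third element: is_higher_better (True, since a larger F1 is better)
    return 'f1', f1_score(y_true, y_hat), True

evals_result = {}

# param, train_data and val_data are assumed to be defined already.
# Note: LightGBM >= 4.0 removed the evals_result argument; there, pass
# callbacks=[lgb.record_evaluation(evals_result)] instead.
clf = lgb.train(param, train_data,
                valid_sets=[val_data, train_data],
                valid_names=['val', 'train'],
                feval=lgb_f1_score,
                evals_result=evals_result)

lgb.plot_metric(evals_result, metric='f1')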
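The function above computes the default binary F1, while the question asked for the weighted average. A minimal sketch of a weighted variant follows. It assumes the task may be multiclass, which the question does not state, and the name lgb_f1_weighted is hypothetical. The reshape branch relies on my understanding that LightGBM versions before 4.0 pass multiclass predictions to feval as a flat array grouped by class (preds[j * num_data + i]), while 4.0 and later pass a 2-D array of shape (n_samples, n_classes); check this against the version you use.

```python
import numpy as np
from sklearn.metrics import f1_score

def lgb_f1_weighted(y_hat, data):
    """Weighted F1 for lgb.train(feval=...); hypothetical helper, not part of LightGBM."""
    y_true = data.get_label().astype(int)
    y_hat = np.asarray(y_hat)

    # Assumption: a 1-D array longer than the label vector is the flat,
    # class-major multiclass layout, so rebuild (n_samples, n_classes).
    if y_hat.ndim == 1 and y_hat.size != y_true.size:
        y_hat = y_hat.reshape(-1, y_true.size).T

    if y_hat.ndim == 2:
        # Multiclass: pick the class with the highest score per row.
        y_pred = y_hat.argmax(axis=1)
    else:
        # Binary: predictions are assumed to be probabilities of class 1.
        y_pred = np.round(y_hat).astype(int)

    return 'f1_weighted', f1_score(y_true, y_pred, average='weighted'), True
```

It would be passed to training the same way as before, for example feval=lgb_f1_weighted.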

To use more than one custom metric, define one overall custom metric function just like the one above, in which you calculate all the metrics and return a list of tuples, each of the form (name, value, is_higher_better).
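As a rough illustration of that idea, the sketch below returns F1 and accuracy together. The function name lgb_f1_and_accuracy and the choice of accuracy as the second metric are assumptions made for the example, and it assumes a binary objective whose predictions are probabilities.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def lgb_f1_and_accuracy(y_hat, data):
    """Hypothetical feval that reports two metrics in one call."""
    y_true = data.get_label().astype(int)
    # Assumes binary probabilities; round them to hard 0/1 labels.
    y_pred = np.round(y_hat).astype(int)
    # One (name, value, is_higher_better) tuple per metric.
    return [
        ('f1', f1_score(y_true, y_pred), True),
        ('accuracy', accuracy_score(y_true, y_pred), True),
    ]
```

Each metric is then recorded under its own name, so both can be inspected after training.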

Edit: fixed the code. Since a larger F1 is better, the third element of the returned tuple (is_higher_better) should be set to True.
