Precision/recall for multiclass-multilabel classification

Question

I'm wondering how to calculate precision and recall measures for multiclass multilabel classification, i.e. classification where there are more than two labels, and where each instance can have multiple labels?

Solution

For multi-label classification you have two ways to go. First consider the following notation.

  • $n$ is the number of examples.
  • $Y_i$ is the ground truth label assignment of the $i^{th}$ example.
  • $x_i$ is the $i^{th}$ example.
  • $h(x_i)$ is the predicted labels for the $i^{th}$ example.

Example based

The metrics are computed in a per-datapoint manner. For each datapoint the score is computed from its predicted and ground truth label sets, and then these scores are aggregated over all the datapoints.

  • Precision = $\frac{1}{n} \sum_{i=1}^{n} \frac{|Y_i \cap h(x_i)|}{|h(x_i)|}$, the ratio of how much of the predicted is correct. The numerator counts how many labels the predicted vector has in common with the ground truth, and the ratio computes how many of the predicted labels are actually in the ground truth.
  • Recall = $\frac{1}{n} \sum_{i=1}^{n} \frac{|Y_i \cap h(x_i)|}{|Y_i|}$, the ratio of how many of the actual labels were predicted. The numerator counts how many labels the predicted vector has in common with the ground truth (as above), then takes the ratio to the number of actual labels, giving the fraction of the actual labels that were predicted. (See the sketch after this list for both quantities computed directly.)
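
To make these example-based definitions concrete, here is a minimal sketch in Python (the language is my choice; the answer itself only points at R and Java libraries), computing precision and recall directly from the formulas above with each $Y_i$ and $h(x_i)$ represented as a plain set of labels. The toy data at the end is made up purely for illustration.

    # Example-based precision and recall for multi-label classification.
    # Y_true[i] plays the role of Y_i, Y_pred[i] the role of h(x_i);
    # both are sets of labels for the i-th example.
    def example_based_precision_recall(Y_true, Y_pred):
        n = len(Y_true)
        precision_sum = 0.0
        recall_sum = 0.0
        for truth, pred in zip(Y_true, Y_pred):
            truth, pred = set(truth), set(pred)
            common = len(truth & pred)               # |Y_i ∩ h(x_i)|
            if pred:                                 # guard against an empty prediction
                precision_sum += common / len(pred)  # fraction of predicted labels that are correct
            if truth:                                # guard against an empty ground truth
                recall_sum += common / len(truth)    # fraction of true labels that were predicted
        return precision_sum / n, recall_sum / n

    # Toy usage: three examples over the label set {a, b, c}
    Y_true = [{"a", "b"}, {"b"}, {"a", "c"}]
    Y_pred = [{"a"}, {"b", "c"}, {"a", "c"}]
    print(example_based_precision_recall(Y_true, Y_pred))  # -> (0.8333..., 0.8333...)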

There are other metrics as well.

Label based

Here things are done label-wise. For each label the metrics (e.g. precision, recall) are computed, and then these label-wise metrics are aggregated. Hence, in this case you end up computing the precision/recall for each label over the entire dataset, as you do for a binary classification (since each label has a binary assignment), then aggregate them.

The easiest way is to present the general form.

This is just an extension of the standard multi-class equivalent.

  • Macro averaged: $\frac{1}{q} \sum_{j=1}^{q} B(TP_j, FP_j, TN_j, FN_j)$

  • Micro averaged: $B\left(\sum_{j=1}^{q} TP_j,\ \sum_{j=1}^{q} FP_j,\ \sum_{j=1}^{q} TN_j,\ \sum_{j=1}^{q} FN_j\right)$

Here $TP_j$, $FP_j$, $TN_j$ and $FN_j$ are the true positive, false positive, true negative and false negative counts respectively for only the $j^{th}$ label, and $q$ is the number of labels.

Here $B$ stands for any confusion-matrix based metric. In your case you would plug in the standard precision and recall formulas. For the macro average you apply the metric to each label's counts and then average the per-label results; for the micro average you pool the counts over all labels first and then apply your metric function once. A sketch of both averages follows.
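
As a companion sketch (again in Python, and again just illustrative rather than taken from mldr or MULAN), the label-based averages can be computed from an $n \times q$ binary indicator matrix: build the per-label $TP_j$, $FP_j$, $FN_j$ counts, then either average the per-label metric (macro) or pool the counts before applying the metric once (micro).

    # Label-based macro/micro averaged precision and recall.
    # Y_true and Y_pred are n x q binary indicator matrices (one column per label).
    import numpy as np

    def label_based_precision_recall(Y_true, Y_pred):
        Y_true = np.asarray(Y_true, dtype=bool)
        Y_pred = np.asarray(Y_pred, dtype=bool)

        tp = (Y_true & Y_pred).sum(axis=0)    # TP_j for each label j
        fp = (~Y_true & Y_pred).sum(axis=0)   # FP_j for each label j
        fn = (Y_true & ~Y_pred).sum(axis=0)   # FN_j for each label j

        # Macro: apply the metric B per label, then average over the q labels.
        with np.errstate(invalid="ignore"):   # labels never predicted / never present give NaN
            macro_precision = np.nanmean(tp / (tp + fp))
            macro_recall = np.nanmean(tp / (tp + fn))

        # Micro: pool the counts over all labels, then apply the metric once.
        micro_precision = tp.sum() / (tp.sum() + fp.sum())
        micro_recall = tp.sum() / (tp.sum() + fn.sum())
        return macro_precision, macro_recall, micro_precision, micro_recall

    # Same toy data as above, as indicator rows over the label order (a, b, c)
    Y_true = [[1, 1, 0], [0, 1, 0], [1, 0, 1]]
    Y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 1]]
    print(label_based_precision_recall(Y_true, Y_pred))  # macro ~0.83/0.83, micro 0.8/0.8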

You might be interested to have a look at the code for the multi-label metrics here, which is part of the mldr package in R. You might also be interested in the Java multi-label library MULAN.

This is a nice paper to get into the different metrics: A Review on Multi-Label Learning Algorithms
