UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples

Question

I'm getting this weird error:

classification.py:1113: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)

but then it also prints the f-score the first time I run:

metrics.f1_score(y_test, y_pred, average='weighted')

The second time I run, it provides the score without error. Why is that?

>>> y_pred = test.predict(X_test)
>>> y_test
array([ 1, 10, 35,  9,  7, 29, 26,  3,  8, 23, 39, 11, 20,  2,  5, 23, 28,
       30, 32, 18,  5, 34,  4, 25, 12, 24, 13, 21, 38, 19, 33, 33, 16, 20,
       18, 27, 39, 20, 37, 17, 31, 29, 36,  7,  6, 24, 37, 22, 30,  0, 22,
       11, 35, 30, 31, 14, 32, 21, 34, 38,  5, 11, 10,  6,  1, 14, 12, 36,
       25,  8, 30,  3, 12,  7,  4, 10, 15, 12, 34, 25, 26, 29, 14, 37, 23,
       12, 19, 19,  3,  2, 31, 30, 11,  2, 24, 19, 27, 22, 13,  6, 18, 20,
        6, 34, 33,  2, 37, 17, 30, 24,  2, 36,  9, 36, 19, 33, 35,  0,  4,
        1])
>>> y_pred
array([ 1, 10, 35,  7,  7, 29, 26,  3,  8, 23, 39, 11, 20,  4,  5, 23, 28,
       30, 32, 18,  5, 39,  4, 25,  0, 24, 13, 21, 38, 19, 33, 33, 16, 20,
       18, 27, 39, 20, 37, 17, 31, 29, 36,  7,  6, 24, 37, 22, 30,  0, 22,
       11, 35, 30, 31, 14, 32, 21, 34, 38,  5, 11, 10,  6,  1, 14, 30, 36,
       25,  8, 30,  3, 12,  7,  4, 10, 15, 12,  4, 22, 26, 29, 14, 37, 23,
       12, 19, 19,  3, 25, 31, 30, 11, 25, 24, 19, 27, 22, 13,  6, 18, 20,
        6, 39, 33,  9, 37, 17, 30, 24,  9, 36, 39, 36, 19, 33, 35,  0,  4,
        1])
>>> metrics.f1_score(y_test, y_pred, average='weighted')
C:\Users\Michael\Miniconda3\envs\snowflakes\lib\site-packages\sklearn\metrics\classification.py:1113: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
0.87282051282051276
>>> metrics.f1_score(y_test, y_pred, average='weighted')
0.87282051282051276
>>> metrics.f1_score(y_test, y_pred, average='weighted')
0.87282051282051276

Also, why is there a trailing 'precision', 'predicted', average, warn_for) in the error message? There is no opening parenthesis, so why does it end with a closing one? I am running sklearn 0.18.1 with Python 3.6.0 in a conda environment on Windows 10.

I also looked here, and I don't know whether it's the same bug. That SO post doesn't have a solution either.

Solution

As mentioned in the comments, some labels in y_test don't appear in y_pred. Specifically in this case, label '2' is never predicted:

>>> set(y_test) - set(y_pred)
{2}

This means there is no F-score to calculate for this label, so the F-score for this case is considered to be 0.0. Since you requested an average of the scores, that 0.0 is included in the calculation, which is why scikit-learn shows you the warning.
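To see exactly where that 0.0 comes from, you can ask for the per-label scores instead of the weighted average. This is a minimal sketch assuming the same y_test and y_pred as above; average=None returns one F-score per label, in sorted label order:

>>> import numpy as np
>>> per_label = metrics.f1_score(y_test, y_pred, average=None)     # one score per label
>>> all_labels = np.unique(np.concatenate([y_test, y_pred]))       # same sorted label order
>>> [int(l) for l, s in zip(all_labels, per_label) if s == 0.0]    # labels scored 0.0; 2 should be among them

Any label listed here contributes a 0.0 to the weighted average, and that is what triggers the warning.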

This brings me to why you don't see the warning the second time. As I mentioned, this is a warning, which is treated differently from an error in Python. The default behavior in most environments is to show a specific warning only once. This behavior can be changed:

import warnings
warnings.filterwarnings('always')  # "error", "ignore", "always", "default", "module" or "once"

If you set this before importing the other modules, you will see the warning every time you run the code.
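For example, a small self-contained sketch; the toy y_true/y_pred below are made up for illustration, and any pair where some label is never predicted will do:

import warnings
warnings.filterwarnings('always')      # set the filter first ...

from sklearn import metrics            # ... then import the modules that emit the warning

y_true = [0, 1, 2, 0, 1, 2]            # label 2 exists in the true labels ...
y_pred = [0, 1, 1, 0, 1, 1]            # ... but is never predicted

# With the 'always' filter, the UndefinedMetricWarning is printed on every call,
# not just the first one.
print(metrics.f1_score(y_true, y_pred, average='weighted'))
print(metrics.f1_score(y_true, y_pred, average='weighted'))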

There is no way to avoid seeing this warning the first time, aside from setting warnings.filterwarnings('ignore'). What you can do is decide that you are not interested in the scores of labels that were not predicted, and then explicitly specify the labels you are interested in (that is, the labels that were predicted at least once):

>>> metrics.f1_score(y_test, y_pred, average='weighted', labels=np.unique(y_pred))
0.91076923076923078

The warning is not shown in this case.
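If you do go the warnings.filterwarnings('ignore') route mentioned above, you can at least restrict it to this particular warning class rather than silencing everything. A sketch, assuming your scikit-learn version exposes UndefinedMetricWarning in sklearn.exceptions:

import warnings
from sklearn.exceptions import UndefinedMetricWarning  # available in 0.18+; adjust if your version differs

# Suppress only this specific warning instead of all warnings.
warnings.filterwarnings('ignore', category=UndefinedMetricWarning)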
