Modifying warnings that seem to come from nowhere


Problem description

I forked a repository named rasa_nlu to work on a part of the code I want to modify. In the file model.py, the function train(...) calls component.train(...), which seems to trigger warnings without indicating their origin, and I want to find out what triggers them.

Basically it applies this function to a list of components:

[<rasa_nlu.utils.spacy_utils.SpacyNLP object at 0x7f3abbfbd780>, <rasa_nlu.tokenizers.spacy_tokenizer.SpacyTokenizer object at 0x7f3abbfbd710>, <rasa_nlu.featurizers.spacy_featurizer.SpacyFeaturizer object at 0x7f3abbfbd748>, <rasa_nlu.featurizers.regex_featurizer.RegexFeaturizer object at 0x7f3abbd1a630>, <rasa_nlu.extractors.crf_entity_extractor.CRFEntityExtractor object at 0x7f3abbd1a748>, <rasa_nlu.extractors.entity_synonyms.EntitySynonymMapper object at 0x7f3abbd1a3c8>, <rasa_nlu.classifiers.sklearn_intent_classifier.SklearnIntentClassifier object at 0x7f3abbd1a240>]

And it seems that the last one triggers the warnings.

I tried to modify the function train() in the components.py file of the repository, but it didn't change anything, so I suspect it is not the right one.

Anyway, here is the code of train(...) in the file model.py:

...

import rasa_nlu
from rasa_nlu import components, utils, config
from rasa_nlu.components import Component, ComponentBuilder
from rasa_nlu.config import RasaNLUModelConfig, override_defaults
from rasa_nlu.persistor import Persistor
from rasa_nlu.training_data import TrainingData, Message
from rasa_nlu.utils import create_dir, write_json_to_file

...

class Trainer(object):
    """Trainer will load the data and train all components.

    Requires a pipeline specification and configuration to use for
    the training."""

    # Officially supported languages (others might be used, but might fail)
    SUPPORTED_LANGUAGES = ["de", "en"]

    def __init__(self,
                 cfg,  # type: RasaNLUModelConfig
                 component_builder=None,  # type: Optional[ComponentBuilder]
                 skip_validation=False  # type: bool
                 ):
        # type: (...) -> None

        self.config = cfg
        self.skip_validation = skip_validation
        self.training_data = None  # type: Optional[TrainingData]

        if component_builder is None:
            # If no builder is passed, every interpreter creation will result in
            # a new builder. hence, no components are reused.
            component_builder = components.ComponentBuilder()

        # Before instantiating the component classes, lets check if all
        # required packages are available
        if not self.skip_validation:
            components.validate_requirements(cfg.component_names)

        # build pipeline
        self.pipeline = self._build_pipeline(cfg, component_builder)

    ...

    def train(self, data, **kwargs):
        # type: (TrainingData) -> Interpreter
        """Trains the underlying pipeline using the provided training data."""
        self.training_data = data

        context = kwargs  # type: Dict[Text, Any]

        for component in self.pipeline:
            updates = component.provide_context()
            if updates:
                context.update(updates)

        # Before the training starts: check that all arguments are provided
        if not self.skip_validation:
            components.validate_arguments(self.pipeline, context)

        # data gets modified internally during the training - hence the copy
        working_data = copy.deepcopy(data)
        for i, component in enumerate(self.pipeline):
            logger.info("Starting to train component {}"
                        "".format(component.name))
            component.prepare_partial_processing(self.pipeline[:i], context)
            print("before train")
            updates = component.train(working_data, self.config,
                                      **context)
            logger.info("Finished training component.")
            print("before updates")
            if updates:
                context.update(updates)
        return Interpreter(self.pipeline, context)

And the output is:

before train
before updates
before train
before updates
before train
before updates
before train
before updates
before train
before updates
before train
before updates
before train
Fitting 2 folds for each of 6 candidates, totalling 12 fits
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
/home/mike/Programming/Rasa/myflaskapp/rasaenv/lib/python3.5/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.
  'precision', 'predicted', average, warn_for)
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.1s finished
before updates
trainer.persist:

Here you can see the warning that I want to catch and modify so that I can find its origin: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 in labels with no predicted samples.

So, can you see where these warnings come from? What calls into sklearn/metrics/classification.py?

Solution

This is a documented issue on the Rasa NLU repository. I would recommend following these issues or adding your comments there for resolution. One is marked as help wanted, meaning they are looking for a community contribution to address it.

The tl;dr on why the warning occurs, from the first issue linked above:

so the warning is just a warning. It indicates that there are too few training examples for one / some of the intents. Adding more examples will fix this (thats why adding duplicates will remove this warning, but really you should be adding different examples).

If you want the warning to go away add more training data. Use the evaluation.py script to find the intents that are lacking.
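
As a rough illustration of what "too few training examples" means, the sketch below counts examples per intent and flags the sparse ones. It is not the evaluation.py script mentioned above, only a simpler stand-in: the `examples` list, the function name `find_sparse_intents`, and the `min_examples` threshold are all hypothetical, and the threshold value is arbitrary.

```python
from collections import Counter


def find_sparse_intents(examples, min_examples=5):
    """Return {intent: count} for intents with fewer than min_examples examples."""
    counts = Counter(intent for _text, intent in examples)
    return {intent: n for intent, n in counts.items() if n < min_examples}


# Hypothetical training data as (text, intent) pairs.
examples = [
    ("hello", "greet"),
    ("hi there", "greet"),
    ("good morning", "greet"),
    ("bye", "goodbye"),
]

if __name__ == "__main__":
    # Intents listed here are candidates for more (and more varied) examples.
    print(find_sparse_intents(examples, min_examples=3))
```

With real data you would build the (text, intent) pairs from your own training file; how that file is parsed is left out here on purpose.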

From the warning message you can see that it is produced in sklearn/metrics/classification.py, which is this file here.
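
If you also want to know which call chain in your own code reaches that file, one option is to have Python print the call stack every time a warning is shown. The following is a minimal sketch using only the standard library `warnings` and `traceback` modules. `DemoWarning`, `noisy_metric`, and `train_component` are hypothetical stand-ins so that the example runs without scikit-learn or rasa_nlu; the assumption is that in the real case you would filter on `sklearn.exceptions.UndefinedMetricWarning` and install the hook before the call to component.train(...).

```python
import io
import sys
import traceback
import warnings


class DemoWarning(UserWarning):
    """Hypothetical stand-in for sklearn.exceptions.UndefinedMetricWarning."""


def noisy_metric():
    # Stand-in for the library function that emits the warning.
    warnings.warn("F-score is ill-defined", DemoWarning)
    return 0.0


def train_component():
    # Stand-in for component.train(...), which reaches the metric indirectly.
    return noisy_metric()


def make_traceback_hook(stream=None):
    """Build a showwarning replacement that prints the call stack first."""
    def showwarning_with_traceback(message, category, filename, lineno,
                                   file=None, line=None):
        out = stream if stream is not None else (file or sys.stderr)
        # The stack shows every caller that led to warnings.warn(...).
        traceback.print_stack(file=out)
        out.write(warnings.formatwarning(message, category, filename, lineno, line))
    return showwarning_with_traceback


def capture_warning_trace():
    """Run the stand-in training call and return the warning plus its stack."""
    buffer = io.StringIO()
    # catch_warnings restores the filters and showwarning on exit.
    with warnings.catch_warnings():
        warnings.simplefilter("always", DemoWarning)
        warnings.showwarning = make_traceback_hook(buffer)
        train_component()
    return buffer.getvalue()


if __name__ == "__main__":
    print(capture_warning_trace())
```

A blunter alternative is `warnings.simplefilter("error", category)`, which turns the warning into an exception with a normal traceback, at the cost of aborting the training run at the first occurrence.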
