Keras custom metric sum is wrong


Problem Description

I tried implementing precision and recall as custom metrics as in https://datascience.stackexchange.com/questions/45165/how-to-get-accuracy-f1-precision-and-recall-for-a-keras-model/45166#45166?newreg=6190503b2be14e8aa2c0069d0a52749e, but for some reason the numbers were off (I do know about the average of batch problem, that's not what I'm talking about).

So I tried implementing another metric:

import tensorflow.keras.backend as K

def p1(y_true, y_pred):
    # sum of positive labels in the current batch
    return K.sum(y_true)

Just to see what would happen... What I'd expect is to see a straight line chart with the number of 1's I have in my dataset (I'm working on a binary classification problem with binary_crossentropy loss).

Because Keras computes custom metrics as averages of the results for each batch, if I have a batch of size 32, I'd expect this p1 metric to return 16, but instead I got 15. If I use a batch of size 16, I get something close to 7.9. That was when I tried with the fit method.
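To see how batch averaging produces numbers like these, the behaviour can be reproduced with plain NumPy — a minimal sketch, where the batch size and the roughly balanced labels are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=320)  # toy binary labels, roughly half 1's

batch_size = 32
# per-batch value of the p1 metric: the sum of y_true over one batch
batch_sums = [labels[i:i + batch_size].sum()
              for i in range(0, len(labels), batch_size)]

# Keras reports the epoch-level metric as the mean of the per-batch values...
epoch_metric = np.mean(batch_sums)

# ...which is the total positive count divided by the number of batches,
# i.e. roughly batch_size * positive_rate — not the global sum.
print(epoch_metric, labels.sum())
```

With balanced labels, the reported value lands near `batch_size / 2` (about 16 here), never near the global count of positives.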

I also calculated the validation precision manually after training the model, and it gives me a different number than the last val_precision I see in history. That was using fit_generator, in which case batch_size is not provided, so I'm assuming it calculates the metric for the entire validation dataset at once.

Another important detail is that when I use the same dataset for training and validation, even when I get the same numbers for true positives and predicted positives at the last epoch, the training and validation precisions are different (1 and 0.6).

true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))

Apparently 32.0 / (32.0 + K.epsilon()) = 0.6000000238418579

Any idea what's wrong?

Something that might help:

def p1(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    return 1.0 / (true_positives + K.epsilon())

def p2(y_true, y_pred):
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    return 1.0 / (predicted_positives + K.epsilon())

def p3(y_true, y_pred):
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    return true_positives

def p4(y_true, y_pred):
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    return predicted_positives
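For what it's worth, under TensorFlow 2.x eager execution these helpers can be sanity-checked on a single toy batch — the tensors below are made-up examples, not values from the original question:

```python
import tensorflow as tf
from tensorflow.keras import backend as K

def p3(y_true, y_pred):
    # count of true positives after thresholding predictions at 0.5 (via round)
    return K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))

def p4(y_true, y_pred):
    # count of predicted positives
    return K.sum(K.round(K.clip(y_pred, 0, 1)))

y_true = tf.constant([1.0, 0.0, 1.0, 1.0])
y_pred = tf.constant([0.9, 0.8, 0.2, 0.7])  # two hits, one false positive, one miss

print(float(p3(y_true, y_pred)))  # true positives      -> 2.0
print(float(p4(y_true, y_pred)))  # predicted positives -> 3.0
```

On a single batch the counts are exact; the distortion only appears once Keras averages these per-batch values across an epoch.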

Solution

Honestly, I ran into the same problem at some point, and for me the best solution was to use Recall and Precision from the built-in metrics.

Starting with TensorFlow 2.0, these two metrics are built into tensorflow.keras.metrics, and they work well provided that you use binary_crossentropy with a Dense(1) at the final layer (in the end, they are metrics for binary classification, of course).
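Wiring them up might look like the following sketch — the layer sizes and input dimension are illustrative assumptions, not part of the original answer:

```python
import tensorflow as tf

# minimal sketch of a binary classifier using the built-in metrics
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),                        # toy feature dimension
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),    # single sigmoid output
])

model.compile(
    loss="binary_crossentropy",
    optimizer="adam",
    metrics=[tf.keras.metrics.Precision(name="precision"),
             tf.keras.metrics.Recall(name="recall")],
)
```

Both metrics threshold the sigmoid output at 0.5 by default, which matches the Dense(1) + binary_crossentropy setup.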

The main thing to note is that the implementation is completely different from what you are trying to achieve, and from what was in Keras before.

In fact, in Keras 1.X, all those metrics were available (F1-Score, Recall, and Precision), but they were removed starting from Keras 2.X, because batch-wise estimation is not relevant for the global estimation of these metrics.

According to Francois Chollet (March 19th 2017) (https://github.com/keras-team/keras/issues/5794):

Basically these are all global metrics that were approximated batch-wise, which is more misleading than helpful. This was mentioned in the docs but it's much cleaner to remove them altogether. It was a mistake to merge them in the first place.

However, in TensorFlow 2.0 (tensorflow.keras.metrics), they use specialised built-in accumulators, and the computations are made properly, thus being relevant for your dataset. You can find a more detailed description here:

https://www.tensorflow.org/api_docs/python/tf/keras/metrics/Recall?version=stable
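The accumulator behaviour can be seen directly: update_state keeps running totals of true positives and false negatives across calls, and result() is computed over everything seen so far. A small sketch with made-up tensors:

```python
import tensorflow as tf

recall = tf.keras.metrics.Recall()

# "batch" 1: three actual positives, all predicted positive (batch recall 1.0)
recall.update_state([1.0, 1.0, 1.0], [0.9, 0.9, 0.9])
# "batch" 2: one actual positive, missed (batch recall 0.0)
recall.update_state([1.0, 0.0], [0.2, 0.1])

# a batch-wise average would give (1.0 + 0.0) / 2 = 0.5,
# but the accumulator computes the global value: 3 TP / 4 positives = 0.75
print(float(recall.result()))  # -> 0.75
```

This is exactly the difference between the removed Keras 2.X batch-wise metrics and the TensorFlow 2.0 built-ins.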

My strong recommendation: use the built-in metrics and skip implementing them by hand, particularly since any manual implementation is naturally computed batch-wise.

If you have issues with loading the model, please ensure the following:

  • Ensure that you have Python 3 installed (>=3.6.X)
  • If the issue persists, then ensure that the custom information is passed to load_model, as in the following snippet:

      metric_config_dict = {
          'precision': precision
      }

      model = tensorflow.keras.models.load_model('path_to_my_model.hdf5', custom_objects=metric_config_dict)
    

Francois Chollet on the release of Keras 2.3.0:

Keras 2.3.0 is the first release of multi-backend Keras that supports TensorFlow 2.0. It maintains compatibility with TensorFlow 1.14, 1.13, as well as Theano and CNTK.

This release brings the API in sync with the tf.keras API as of TensorFlow 2.0. However note that it does not support most TensorFlow 2.0 features, in particular eager execution. If you need these features, use tf.keras.

This is also the last major release of multi-backend Keras. Going forward, we recommend that users consider switching their Keras code to tf.keras in TensorFlow 2.0. It implements the same Keras 2.3.0 API (so switching should be as easy as changing the Keras import statements), but it has many advantages for TensorFlow users, such as support for eager execution, distribution, TPU training, and generally far better integration between low-level TensorFlow and high-level concepts like Layer and Model. It is also better maintained.

Development will focus on tf.keras going forward. We will keep maintaining multi-backend Keras over the next 6 months, but we will only be merging bug fixes. API changes will not be ported.

Therefore, even the creator of Keras recommends switching to tf.keras instead of plain keras. Please switch in your code as well and check whether the problems persist. If you mix tf.keras and keras, you will get all sorts of odd errors, so change all your imports to tf.keras. For more information w.r.t. TensorFlow 2.0 and further changes, you can consult this: https://www.pyimagesearch.com/2019/10/21/keras-vs-tf-keras-whats-the-difference-in-tensorflow-2-0/
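In practice the switch is mostly a matter of changing the import statements, e.g. (the paths shown are the standard tf.keras module paths):

```python
# Before (multi-backend Keras) — do not mix these with tf.keras:
#   from keras.models import Sequential
#   from keras import backend as K

# After (tf.keras, recommended):
from tensorflow.keras.models import Sequential
from tensorflow.keras import backend as K
```

Everything else — layer names, compile arguments, metric functions — keeps the same Keras 2.3.0 API.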
