Unexpected keyword argument 'sample_weight' when sub-classing TensorFlow loss class (categorical_crossentropy) to create a weighted loss function


Question

Struggling to get a sub-classed loss function to work in TensorFlow (2.2.0). Initially tried this code (which I know has worked for others - see https://github.com/keras-team/keras/issues/2115#issuecomment-530762739):

import tensorflow.keras.backend as K
from tensorflow.keras.losses import CategoricalCrossentropy


class WeightedCategoricalCrossentropy(CategoricalCrossentropy):

    def __init__(self, cost_mat, name='weighted_categorical_crossentropy', **kwargs):
        assert(cost_mat.ndim == 2)
        assert(cost_mat.shape[0] == cost_mat.shape[1])

        super().__init__(name=name, **kwargs)
        self.cost_mat = K.cast_to_floatx(cost_mat)

    def __call__(self, y_true, y_pred):

        return super().__call__(
            y_true=y_true,
            y_pred=y_pred,
            sample_weight=get_sample_weights(y_true, y_pred, self.cost_mat),
        )

def get_sample_weights(y_true, y_pred, cost_m):
    num_classes = len(cost_m)

    y_pred.shape.assert_has_rank(2)
    y_pred.shape[1].assert_is_compatible_with(num_classes)
    y_pred.shape.assert_is_compatible_with(y_true.shape)

    y_pred = K.one_hot(K.argmax(y_pred), num_classes)

    y_true_nk1 = K.expand_dims(y_true, 2)
    y_pred_n1k = K.expand_dims(y_pred, 1)
    cost_m_1kk = K.expand_dims(cost_m, 0)

    sample_weights_nkk = cost_m_1kk * y_true_nk1 * y_pred_n1k
    sample_weights_n = K.sum(sample_weights_nkk, axis=[1, 2])

    return sample_weights_n

Used as follows:

model.compile(optimizer='adam',
              loss={'simple_Class': 'categorical_crossentropy',
                    'soundClass': 'binary_crossentropy', 
                    'auxiliary_soundClass':'binary_crossentropy',
                    'auxiliary_class_training': WeightedCategoricalCrossentropy(cost_matrix), 
                    'class_training':WeightedCategoricalCrossentropy(cost_matrix)
},

              loss_weights={'simple_Class': 1.0,
                            'soundClass': 1.0, 
                            'auxiliary_soundClass':0.7,
                            'auxiliary_class_training': 0.7,
                            'class_training':0.4})

(where cost_matrix is a 2-dimensional NumPy array). Training runs through model.fit() with batch_size=512. However, this results in the following error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-21-3428d6d8967a> in <module>()
     82          'class_training': class_lables_test}),
     83 
---> 84     epochs=nb_epoch, batch_size=batch_size, initial_epoch=initial_epoch, verbose=0, shuffle=True, callbacks=[se, tb, cm, mc, es, rs])
     85 
     86 #model.save(save_version_dir,save_format='tf')

10 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in _method_wrapper(self, *args, **kwargs)
     64   def _method_wrapper(self, *args, **kwargs):
     65     if not self._in_multi_worker_mode():  # pylint: disable=protected-access
---> 66       return method(self, *args, **kwargs)
     67 
     68     # Running inside `run_distribute_coordinator` already.

/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, validation_batch_size, validation_freq, max_queue_size, workers, use_multiprocessing)
    846                 batch_size=batch_size):
    847               callbacks.on_train_batch_begin(step)
--> 848               tmp_logs = train_function(iterator)
    849               # Catch OutOfRangeError for Datasets of unknown size.
    850               # This blocks until the batch has finished executing.

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in __call__(self, *args, **kwds)
    578         xla_context.Exit()
    579     else:
--> 580       result = self._call(*args, **kwds)
    581 
    582     if tracing_count == self._get_tracing_count():

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in _call(self, *args, **kwds)
    625       # This is the first call of __call__, so we have to initialize.
    626       initializers = []
--> 627       self._initialize(args, kwds, add_initializers_to=initializers)
    628     finally:
    629       # At this point we know that the initialization is complete (or less

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in _initialize(self, args, kwds, add_initializers_to)
    504     self._concrete_stateful_fn = (
    505         self._stateful_fn._get_concrete_function_internal_garbage_collected(  # pylint: disable=protected-access
--> 506             *args, **kwds))
    507 
    508     def invalid_creator_scope(*unused_args, **unused_kwds):

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _get_concrete_function_internal_garbage_collected(self, *args, **kwargs)
   2444       args, kwargs = None, None
   2445     with self._lock:
-> 2446       graph_function, _, _ = self._maybe_define_function(args, kwargs)
   2447     return graph_function
   2448 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _maybe_define_function(self, args, kwargs)
   2775 
   2776       self._function_cache.missed.add(call_context_key)
-> 2777       graph_function = self._create_graph_function(args, kwargs)
   2778       self._function_cache.primary[cache_key] = graph_function
   2779       return graph_function, args, kwargs

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py in _create_graph_function(self, args, kwargs, override_flat_arg_shapes)
   2665             arg_names=arg_names,
   2666             override_flat_arg_shapes=override_flat_arg_shapes,
-> 2667             capture_by_value=self._capture_by_value),
   2668         self._function_attributes,
   2669         # Tell the ConcreteFunction to clean up its graph once it goes out of

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in func_graph_from_py_func(name, python_func, args, kwargs, signature, func_graph, autograph, autograph_options, add_control_dependencies, arg_names, op_return_value, collections, capture_by_value, override_flat_arg_shapes)
    979         _, original_func = tf_decorator.unwrap(python_func)
    980 
--> 981       func_outputs = python_func(*func_args, **func_kwargs)
    982 
    983       # invariant: `func_outputs` contains only Tensors, CompositeTensors,

/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/def_function.py in wrapped_fn(*args, **kwds)
    439         # __wrapped__ allows AutoGraph to swap in a converted function. We give
    440         # the function a weak reference to itself to avoid a reference cycle.
--> 441         return weak_wrapped_fn().__wrapped__(*args, **kwds)
    442     weak_wrapped_fn = weakref.ref(wrapped_fn)
    443 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/func_graph.py in wrapper(*args, **kwargs)
    966           except Exception as e:  # pylint:disable=broad-except
    967             if hasattr(e, "ag_error_metadata"):
--> 968               raise e.ag_error_metadata.to_exception(e)
    969             else:
    970               raise

TypeError: in user code:

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:571 train_function  *
        outputs = self.distribute_strategy.run(
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:951 run  **
        return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2290 call_for_each_replica
        return self._call_for_each_replica(fn, args, kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/distribute/distribute_lib.py:2649 _call_for_each_replica
        return fn(*args, **kwargs)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/training.py:533 train_step  **
        y, y_pred, sample_weight, regularization_losses=self.losses)
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/compile_utils.py:205 __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)

    TypeError: __call__() got an unexpected keyword argument 'sample_weight'

This problem is resolved when I replace the __call__() magic method with call() and implement some of the underlying logic manually. This works, with the same usage. The __call__ method is changed to:

def call(self, y_true, y_pred):
    return super().call(y_true, y_pred) * get_sample_weights(y_true, y_pred, self.cost_mat)

i.e. we calculate a categorical cross-entropy loss on y_true and y_pred and then multiply it by the weights derived from our cost matrix directly, rather than passing y_true, y_pred and self.cost_mat to the categorical cross-entropy __call__ method and using the inherited method's own logic for multiplying the loss by the weights. This isn't a massive problem, as the code does work - but I can't figure out why I was unable to use the inherited class's own __call__ implementation properly (as per the original code).
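To make the weighting concrete, here is a minimal pure-Python sketch of what get_sample_weights computes and how the result scales a per-sample cross-entropy. It is an illustration only: the nested lists stand in for tensors, the numbers are made up, and the cross-entropy omits the clipping and normalisation that Keras applies.

```python
import math


def sample_weights(y_true, y_pred, cost_m):
    """Pure-Python version of the weight lookup; inputs are nested lists."""
    num_classes = len(cost_m)
    weights = []
    for true_row, pred_row in zip(y_true, y_pred):
        # Hard prediction: one-hot of the argmax, as in K.one_hot(K.argmax(...)).
        best = max(range(num_classes), key=lambda k: pred_row[k])
        pred_one_hot = [1.0 if k == best else 0.0 for k in range(num_classes)]
        # Sum over both class axes of cost_m[i][j] * y_true[i] * y_pred[j].
        w = sum(
            cost_m[i][j] * true_row[i] * pred_one_hot[j]
            for i in range(num_classes)
            for j in range(num_classes)
        )
        weights.append(w)
    return weights


# Illustrative values only (not from the original post).
cost_matrix = [
    [1.0, 2.0, 3.0],
    [4.0, 1.0, 5.0],
    [6.0, 7.0, 1.0],
]
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.2, 0.5, 0.3]]

weights = sample_weights(y_true, y_pred, cost_matrix)
# Sample 0: true 0, predicted 0 -> cost_matrix[0][0]
# Sample 1: true 1, predicted 2 -> cost_matrix[1][2]
# Sample 2: true 2, predicted 1 -> cost_matrix[2][1]

# Simplified per-sample categorical cross-entropy, then the weighting step.
per_sample_ce = [
    -sum(t * math.log(p) for t, p in zip(true_row, pred_row))
    for true_row, pred_row in zip(y_true, y_pred)
]
weighted = [ce * w for ce, w in zip(per_sample_ce, weights)]
```

With one-hot labels, the double sum collapses to a single lookup, cost_matrix[true_class][predicted_class], so each sample's loss is scaled by the cost of the specific confusion it represents.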

Also, I changed y_pred.shape[1].assert_is_compatible_with(num_classes) to assert(y_pred.shape[1] == num_classes), because y_pred.shape[1] was returning an int. I have no idea why: inspecting y_pred shows it is, of course, a tf.Tensor, so I expected .shape[1] to return a tf.TensorShape object on which .assert_is_compatible_with() could be called.

This is the whole class implementation that I've used successfully.

Note - it includes from_config and get_config methods, alongside an explicit assignment to the Keras loss namespace (last line) to enable saving of the whole model plus optimizer state through model.save(save_format='tf'). Some of this functionality was challenging to get working: I had to implement an explicit cast to a NumPy array (see the first line of the __init__ method).
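The list-instead-of-array behaviour mentioned above can be reproduced without TensorFlow. The sketch below assumes that the saved config passes through a JSON-like serialisation step (an assumption about the saving path, not something shown in the post), and uses a hypothetical stand-in class, CostMatrixConfig, rather than a real Keras loss.

```python
import json


class CostMatrixConfig:
    """Hypothetical stand-in showing only the config round trip, not a Keras loss."""

    def __init__(self, cost_mat, name="weighted_categorical_crossentropy"):
        # Normalise to a list of float rows so lists, tuples and
        # array-like rows are all accepted after deserialisation.
        rows = [[float(v) for v in row] for row in cost_mat]
        assert all(len(row) == len(rows) for row in rows), "cost_mat must be square"
        self.cost_mat = rows
        self.name = name

    def get_config(self):
        # Only JSON-friendly types go into the config.
        return {"name": self.name, "cost_mat": self.cost_mat}

    @classmethod
    def from_config(cls, config):
        # Whatever container type went in, JSON hands back plain lists here.
        return cls(**config)


original = CostMatrixConfig(((1, 2), (3, 1)))
serialised = json.dumps(original.get_config())
restored = CostMatrixConfig.from_config(json.loads(serialised))
```

Because the constructor normalises its input, it does not matter whether cost_mat arrives as a tuple, a list, or rows of an array; this is the same role the np.array(cost_mat) cast plays in the class below.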

import numpy as np
import tensorflow as tf
import tensorflow.keras.backend as K


class WeightedCategoricalCrossentropy(tf.keras.losses.CategoricalCrossentropy):

  def __init__(self, cost_mat, name='weighted_categorical_crossentropy', **kwargs):

    cost_mat = np.array(cost_mat)   
    ## When loading from config, cost_mat arrives as a list rather than a NumPy array.
    ## The line above fixes this, so that .ndim can be called successfully.
    ## However, this is probably not the best implementation.

    assert(cost_mat.ndim == 2)
    assert(cost_mat.shape[0] == cost_mat.shape[1])
    super().__init__(name=name, **kwargs)
    self.cost_mat = K.cast_to_floatx(cost_mat)

  def call(self, y_true, y_pred):
    return super().call(y_true, y_pred) * get_sample_weights(y_true, y_pred, self.cost_mat)

  def get_config(self):
    config = super().get_config().copy()
    # Calling .update on the line above, during assignment, causes an error with config becoming None-type.
    config.update({'cost_mat': (self.cost_mat)})
    return config

  @classmethod
  def from_config(cls, config):
    # something goes wrong here and changes self.cost_mat to a list variable.
    # See above for temporary fix
    return cls(**config)

def get_sample_weights(y_true, y_pred, cost_m):
    num_classes = len(cost_m)

    y_pred.shape.assert_has_rank(2)
    assert(y_pred.shape[1] == num_classes)
    y_pred.shape.assert_is_compatible_with(y_true.shape)

    y_pred = K.one_hot(K.argmax(y_pred), num_classes)

    y_true_nk1 = K.expand_dims(y_true, 2)
    y_pred_n1k = K.expand_dims(y_pred, 1)
    cost_m_1kk = K.expand_dims(cost_m, 0)

    sample_weights_nkk = cost_m_1kk * y_true_nk1 * y_pred_n1k
    sample_weights_n = K.sum(sample_weights_nkk, axis=[1, 2])

    return sample_weights_n


tf.keras.losses.WeightedCategoricalCrossentropy = WeightedCategoricalCrossentropy

Finally, saving the model is implemented like so:

model.save(save_version_dir,save_format='tf')

and loading the model as follows:

model = tf.keras.models.load_model(
          save_version_dir,
          compile=True,
          custom_objects={
             'WeightedCategoricalCrossentropy': WeightedCategoricalCrossentropy(cost_matrix)
              }
           )

Recommended answer

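As per the comments, the issue here is that TensorFlow now enforces the original method signature on subclasses: as the traceback shows, Keras calls the loss as loss_obj(y_t, y_p, sample_weight=sw), so an overriding __call__ must still accept sample_weight.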
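The failure mode can be reproduced in plain Python, independent of TensorFlow. The classes below (BaseLoss, NarrowLoss, CompatibleLoss) and the helper framework_call are hypothetical stand-ins written for this illustration; they only mimic the calling convention visible in the traceback.

```python
class BaseLoss:
    """Stand-in for a Keras-style loss; not the real TensorFlow class."""

    def __call__(self, y_true, y_pred, sample_weight=None):
        losses = [abs(t - p) for t, p in zip(y_true, y_pred)]
        if sample_weight is not None:
            losses = [l * w for l, w in zip(losses, sample_weight)]
        return sum(losses) / len(losses)


class NarrowLoss(BaseLoss):
    # Drops `sample_weight` from the signature, like the original attempt.
    def __call__(self, y_true, y_pred):
        return super().__call__(y_true, y_pred, sample_weight=[2.0] * len(y_true))


class CompatibleLoss(BaseLoss):
    # Keeps the parent's signature, so framework-style calls still work.
    def __call__(self, y_true, y_pred, sample_weight=None):
        assert sample_weight is None, "weights are derived internally"
        return super().__call__(y_true, y_pred, sample_weight=[2.0] * len(y_true))


def framework_call(loss_obj, y_true, y_pred):
    # Mimics the call seen in the traceback: loss_obj(y_t, y_p, sample_weight=sw)
    return loss_obj(y_true, y_pred, sample_weight=None)


try:
    framework_call(NarrowLoss(), [1.0, 0.0], [0.5, 0.5])
    narrow_error = None
except TypeError as exc:
    narrow_error = str(exc)  # mentions the unexpected 'sample_weight' keyword

compatible_value = framework_call(CompatibleLoss(), [1.0, 0.0], [0.5, 0.5])
```

The caller always passes sample_weight by keyword, even when its value is None, so the override has to accept that parameter whether or not it uses it.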

The following has been tested (by comparing equal weighting in the cost_matrix to weighting all but a single category to nothing) on a toy problem and works:

import numpy as np
import tensorflow as tf
import tensorflow.keras.backend as K


class WeightedCategoricalCrossentropy(tf.keras.losses.CategoricalCrossentropy):

  def __init__(self, cost_mat, name='weighted_categorical_crossentropy', **kwargs):

    cost_mat = np.array(cost_mat)   
    ## When loading from config, cost_mat arrives as a list rather than a NumPy array.
    ## The line above fixes this, so that .ndim can be called successfully.
    ## However, this is probably not the best implementation.
    assert(cost_mat.ndim == 2)
    assert(cost_mat.shape[0] == cost_mat.shape[1])
    super().__init__(name=name, **kwargs)
    self.cost_mat = K.cast_to_floatx(cost_mat)

  def __call__(self, y_true, y_pred, sample_weight=None):
    assert sample_weight is None, "should only be derived from the cost matrix"  
    return super().__call__(
        y_true=y_true, 
        y_pred=y_pred, 
        sample_weight=get_sample_weights(y_true, y_pred, self.cost_mat),
    )


  def get_config(self):
    config = super().get_config().copy()
    # Calling .update on the line above, during assignment, causes an error with config becoming None-type.
    config.update({'cost_mat': (self.cost_mat)})
    return config

  @classmethod
  def from_config(cls, config):
    # something goes wrong here and changes self.cost_mat to a list variable.
    # See above for temporary fix
    return cls(**config)

def get_sample_weights(y_true, y_pred, cost_m):
    num_classes = len(cost_m)

    y_pred.shape.assert_has_rank(2)
    assert(y_pred.shape[1] == num_classes)
    y_pred.shape.assert_is_compatible_with(y_true.shape)

    y_pred = K.one_hot(K.argmax(y_pred), num_classes)

    y_true_nk1 = K.expand_dims(y_true, 2)
    y_pred_n1k = K.expand_dims(y_pred, 1)
    cost_m_1kk = K.expand_dims(cost_m, 0)

    sample_weights_nkk = cost_m_1kk * y_true_nk1 * y_pred_n1k
    sample_weights_n = K.sum(sample_weights_nkk, axis=[1, 2])

    return sample_weights_n


# Register the loss in the Keras namespace to enable loading of the custom object.
tf.keras.losses.WeightedCategoricalCrossentropy = WeightedCategoricalCrossentropy
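
The sanity check described before the class (equal weights versus zeroing out all but one category) can be sketched in pure Python. This is a simplified model of the loss, assuming one-hot labels and a plain mean over the batch; weighted_mean_ce is a hypothetical helper and the data are made up for illustration.

```python
import math


def weighted_mean_ce(y_true, y_pred, cost_m):
    """Mean of per-sample cross-entropy times cost_m[true_class][predicted_class]."""
    total = 0.0
    for true_row, pred_row in zip(y_true, y_pred):
        true_cls = true_row.index(1)
        pred_cls = pred_row.index(max(pred_row))
        ce = -math.log(pred_row[true_cls])
        total += ce * cost_m[true_cls][pred_cls]
    return total / len(y_true)


# Illustrative two-class batch.
y_true = [[1, 0], [0, 1], [0, 1]]
y_pred = [[0.8, 0.2], [0.6, 0.4], [0.3, 0.7]]

uniform = [[1.0, 1.0], [1.0, 1.0]]       # every outcome weighted equally
only_class_0 = [[1.0, 1.0], [0.0, 0.0]]  # rows index the true class

plain_mean = sum(-math.log(p[t.index(1)]) for t, p in zip(y_true, y_pred)) / len(y_true)
uniform_loss = weighted_mean_ce(y_true, y_pred, uniform)
class_0_loss = weighted_mean_ce(y_true, y_pred, only_class_0)
```

With the uniform matrix the weighted loss should match the unweighted mean, and with only_class_0 only samples whose true class is 0 contribute, which is the behaviour the toy-problem test looks for.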

Usage

Where cost_matrix is a 2D NumPy array, e.g.:

[
 [ Weight Category 1 predicted as Category 1,
   Weight Category 1 predicted as Category 2,
   Weight Category 1 predicted as Category 3 ],
 [ Weight Category 2 predicted as Category 1,
   ...,
   ...                                       ],
 [ ...,
   ...,
   Weight Category 3 predicted as Category 3 ]
]


model.compile(
     optimizer='adam',
     loss=WeightedCategoricalCrossentropy(cost_matrix)
     )

Saving the model

model.save(save_version_dir,save_format='tf')

Loading the model

model = tf.keras.models.load_model(
    save_version_dir,
    compile=True,
    custom_objects={
        'WeightedCategoricalCrossentropy': WeightedCategoricalCrossentropy(cost_matrix)
        }
    )
