Why does the Keras Lambda layer cause a problem in Mask_RCNN?


Problem description


I'm using the Mask_RCNN package from this repo: https://github.com/matterport/Mask_RCNN.


I tried to train my own dataset using this package, but it gives me an error right at the beginning.

2020-11-30 12:13:16.577252: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-11-30 12:13:16.587017: E tensorflow/stream_executor/cuda/cuda_driver.cc:314] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-11-30 12:13:16.587075: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (7612ade969e5): /proc/driver/nvidia/version does not exist
2020-11-30 12:13:16.587479: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2020-11-30 12:13:16.593569: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2300000000 Hz
2020-11-30 12:13:16.593811: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x1b2aa00 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-11-30 12:13:16.593846: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Traceback (most recent call last):
  File "machines.py", line 345, in <module>
    model_dir=args.logs)
  File "/content/Mask_RCNN/mrcnn/model.py", line 1837, in __init__
    self.keras_model = self.build(mode=mode, config=config)
  File "/content/Mask_RCNN/mrcnn/model.py", line 1934, in build
    anchors = KL.Lambda(lambda x: tf.Variable(anchors), name="anchors")(input_image)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 926, in __call__
    input_list)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1117, in _functional_construction_call
    outputs = call_fn(cast_inputs, *args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py", line 904, in call
    self._check_variables(created_variables, tape.watched_variables())
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py", line 931, in _check_variables
    raise ValueError(error_str)
ValueError: 
The following Variables were created within a Lambda layer (anchors)
but are not tracked by said layer:
  <tf.Variable 'anchors/Variable:0' shape=(1, 261888, 4) dtype=float32>
The layer cannot safely ensure proper Variable reuse across multiple
calls, and consquently this behavior is disallowed for safety. Lambda
layers are not well suited to stateful computation; instead, writing a
subclassed Layer is the recommend way to define layers with
Variables.


I looked up the part of the code responsible for the problem (file /mrcnn/model.py, line 1935 in the repo):

anchors = KL.Lambda(lambda x: tf.Variable(anchors), name="anchors")(input_image)


If anyone has an idea how to solve it, or has already solved it, please share the solution.

Recommended answer


ROOT CAUSE: The behavior of Keras's Lambda layer changed between TensorFlow 1.X and TensorFlow 2.X. In Keras under TensorFlow 1.X, every tf.Variable and tf.get_variable created inside a layer was automatically tracked into layer.weights via the variable-creator context, so those variables automatically received gradients and became trainable. That approach conflicts with AutoGraph compilation, which converts Python code into an execution graph in TensorFlow 2.X, so it was removed; the Lambda layer now checks for variable creation and raises the error you see. In short, a Lambda layer in TensorFlow 2.X has to be stateless. If you want to create a variable, the correct way in TensorFlow 2.X is to subclass the Layer class and add the trainable weight as a class member.
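As a minimal sketch of the failure mode (assuming TensorFlow 2.X with tf.keras; the anchor array and input shape are made up for illustration), a Lambda whose function creates a tf.Variable trips this check:

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in for Mask_RCNN's real anchors array.
anchors = np.zeros((1, 8, 4), dtype=np.float32)
inp = tf.keras.Input(shape=(4,))

# Creating a tf.Variable inside a Lambda is disallowed in TF 2.X,
# because the Lambda layer cannot track the variable it created.
try:
    tf.keras.layers.Lambda(lambda x: tf.Variable(anchors))(inp)
    raised = False
except ValueError:
    raised = True

print("ValueError raised:", raised)
```

This reproduces, in miniature, the exact error the question's traceback shows at `mrcnn/model.py` line 1934.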


SOLUTIONS: There are two options:


  1. Switch to TensorFlow 1.X. This error will not be raised.


  2. Replace the Lambda layer with a subclass of the Keras Layer class:

import tensorflow as tf

class AnchorsLayer(tf.keras.layers.Layer):

    def __init__(self, anchors, **kwargs):
        super(AnchorsLayer, self).__init__(**kwargs)
        # Assigned as an attribute, so the layer tracks the variable.
        self.anchors_v = tf.Variable(anchors)

    def call(self, inputs):
        return self.anchors_v

# Then replace the Lambda call with this:

anchors_layer = AnchorsLayer(anchors, name="anchors")
anchors = anchors_layer(input_image)
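A quick self-contained check (TensorFlow 2.X with tf.keras assumed; the small anchor array and input shape are made up for illustration) confirms that the subclassed layer tracks the variable, which is exactly what the Lambda layer could not guarantee:

```python
import numpy as np
import tensorflow as tf

class AnchorsLayer(tf.keras.layers.Layer):
    def __init__(self, anchors, **kwargs):
        super().__init__(**kwargs)
        self.anchors_v = tf.Variable(anchors)  # tracked as a layer attribute

    def call(self, inputs):
        return self.anchors_v

anchors = np.zeros((1, 16, 4), dtype=np.float32)  # hypothetical stand-in
layer = AnchorsLayer(anchors, name="anchors")
inp = tf.keras.Input(shape=(4,))
out = layer(inp)  # no ValueError, unlike the Lambda version

# The variable now shows up in the layer's tracked weights.
print(len(layer.weights))
```

Because the variable lives in `layer.weights`, Keras can manage reuse, saving, and gradient flow for it safely across calls.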

