In TensorFlow 2.0 with eager-execution, how to compute the gradients of a network output w.r.t. a specific layer?

Problem description

I have a network made with InceptionNet, and for an input sample bx, I want to compute the gradients of the model output w.r.t. the hidden layer. I have the following code:

bx = tf.reshape(x_batch[0, :, :, :], (1, 299, 299, 3))


with tf.GradientTape() as gtape:
    #gtape.watch(x)
    preds = model(bx)
    print(preds.shape, end='  ')

    class_idx = np.argmax(preds[0])
    print(class_idx, end='   ')

    class_output = model.output[:, class_idx]
    print(class_output, end='   ')

    last_conv_layer = model.get_layer('inception_v3').get_layer('mixed10')
    #gtape.watch(last_conv_layer)
    print(last_conv_layer)


grads = gtape.gradient(class_output, last_conv_layer.output)#[0]
print(grads)

But, this will give None. I tried gtape.watch(bx) as well, but it still gives None.

Before trying GradientTape, I tried using tf.keras.backend.gradient but that gave an error as follows:

RuntimeError: tf.gradients is not supported when eager execution is enabled. Use tf.GradientTape instead.

My model is as follows:

model.summary()

Model: "sequential_4"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
inception_v3 (Model)         (None, 1000)              23851784  
_________________________________________________________________
dense_5 (Dense)              (None, 2)                 2002      
=================================================================
Total params: 23,853,786
Trainable params: 23,819,354
Non-trainable params: 34,432
_________________________________________________________________
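
For reference, a model matching this summary could be built roughly as follows. This is a minimal sketch, not the asker's actual code; the weights and activation arguments are assumptions.

import tensorflow as tf

# Minimal sketch reconstructing a model that matches the summary above.
# InceptionV3 with its classification head outputs (None, 1000) and shows
# up as the nested 'inception_v3' layer; Dense(2) adds 1000*2 + 2 = 2002
# parameters, matching 'dense_5'.
base = tf.keras.applications.InceptionV3(weights='imagenet', include_top=True)
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(2, activation='softmax'),
])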

Any solution is appreciated. It doesn't have to be GradientTape, if there is any other way to compute these gradients.

Recommended answer

I had the same problem as you. I'm not sure if this is the cleanest way to solve the problem, but here's my solution.

I think the problem is that you need to pass along the actual return value of last_conv_layer.call(...) as an argument to tape.watch(). Since all layers are called sequentially within the scope of the model(bx) call, you'll have to somehow inject some code into this inner scope. I did this using the following decorator:

def watch_layer(layer, tape):
    """
    Make an intermediate hidden `layer` watchable by the `tape`.
    After calling this function, you can obtain the gradient with
    respect to the output of the `layer` by calling:

        grads = tape.gradient(..., layer.result)

    """
    def decorator(func):
        def wrapper(*args, **kwargs):
            # Store the result of `layer.call` internally.
            layer.result = func(*args, **kwargs)
            # From this point onwards, watch this tensor.
            tape.watch(layer.result)
            # Return the result to continue with the forward pass.
            return layer.result
        return wrapper
    layer.call = decorator(layer.call)
    return layer
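
Note that watch_layer monkey-patches layer.call in place: the wrapper stays attached to the layer after the tape is gone, and calling watch_layer again would wrap layer.call a second time, so apply it only once (or restore the original layer.call if you need to reuse the model elsewhere).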

In your example, I believe the following should then work for you:

bx = tf.reshape(x_batch[0, :, :, :], (1, 299, 299, 3))
last_conv_layer = model.get_layer('inception_v3').get_layer('mixed10')
with tf.GradientTape() as gtape:
    # Make the `last_conv_layer` watchable
    watch_layer(last_conv_layer, gtape)  
    preds = model(bx)
    class_idx = np.argmax(preds[0])
    class_output = preds[:, class_idx]  # use the eager preds, not the symbolic model.output
# Get the gradient w.r.t. the output of `last_conv_layer`
grads = gtape.gradient(class_output, last_conv_layer.result)  
print(grads)
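
As a side note, an alternative to the monkey-patching above (not part of the original answer) is to build a second functional model that returns both the hidden layer's output and the predictions, so the intermediate tensor is produced inside the tape directly. A sketch reusing the layer names from the question:

inception = model.get_layer('inception_v3')
dense = model.get_layer('dense_5')

# Expose the output of 'mixed10' as an extra model output, so no
# patching of layer.call is needed.
grad_model = tf.keras.models.Model(
    inputs=inception.inputs,
    outputs=[inception.get_layer('mixed10').output, inception.output],
)

with tf.GradientTape() as gtape:
    conv_output, inception_preds = grad_model(bx)
    preds = dense(inception_preds)   # finish the forward pass through dense_5
    class_idx = np.argmax(preds[0])
    class_output = preds[:, class_idx]

# conv_output was computed inside the tape, so no explicit watch() is needed.
grads = gtape.gradient(class_output, conv_output)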
