Tensorflow 2.0 doesn't compute the gradient

Problem description

I want to visualize the patterns that a given feature map in a CNN has learned (in this example I'm using vgg16). To do so, I create a random image, feed it through the network up to the desired convolutional layer, choose the feature map, and find the gradients with respect to the input. The idea is to change the input in a way that maximizes the activation of the desired feature map. Using TensorFlow 2.0, I have a GradientTape that follows the function and then computes the gradient; however, the gradient returns None. Why is it unable to compute the gradient?

import tensorflow as tf
import matplotlib.pyplot as plt
import time
import numpy as np
from tensorflow.keras.applications import vgg16

class maxFeatureMap():

    def __init__(self, model):

        self.model = model
        self.optimizer = tf.keras.optimizers.Adam()

    def getNumLayers(self, layer_name):

        for layer in self.model.layers:
            if layer.name == layer_name:
                weights = layer.get_weights()
                num = weights[1].shape[0]
        return ("There are {} feature maps in {}".format(num, layer_name))

    def getGradient(self, layer, feature_map):

        pic = vgg16.preprocess_input(np.random.uniform(size=(1,96,96,3))) ## Creates values between 0 and 1
        pic = tf.convert_to_tensor(pic)

        model = tf.keras.Model(inputs=self.model.inputs, 
                               outputs=self.model.layers[layer].output)
        with tf.GradientTape() as tape:
            ## predicts the output of the model and only chooses the feature_map indicated
            predictions = model.predict(pic, steps=1)[0][:,:,feature_map]
            loss = tf.reduce_mean(predictions)
        print(loss)
        gradients = tape.gradient(loss, pic[0])
        print(gradients)
        self.optimizer.apply_gradients(zip(gradients, pic))

model = vgg16.VGG16(weights='imagenet', include_top=False)


x = maxFeatureMap(model)
x.getGradient(1, 24)


Recommended answer

This is a common pitfall with GradientTape: the tape only traces tensors that are set to be "watched", and by default a tape watches only trainable variables (meaning tf.Variable objects created with trainable=True). To watch the pic tensor, you should add tape.watch(pic) as the very first line inside the tape context.
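For reference, here is a minimal, self-contained sketch of that behavior (plain tensors rather than the question's model) showing how tape.watch turns a None gradient into a real one:

import tensorflow as tf

x = tf.constant(3.0)           # a plain tensor is not watched by default

with tf.GradientTape() as tape:
    y = x * x
print(tape.gradient(y, x))     # None: the tape never traced x

with tf.GradientTape() as tape:
    tape.watch(x)              # explicitly ask the tape to trace x
    y = x * x
print(tape.gradient(y, x))     # tf.Tensor(6.0, shape=(), dtype=float32)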

Also, I'm not sure if the indexing (pic[0]) will work, so you might want to remove that -- since pic has just one entry in the first dimension it shouldn't matter anyway.

Furthermore, you cannot use model.predict because this returns a numpy array, which basically "destroys" the computation graph chain so gradients won't be backpropagated. You should simply use the model as a callable, i.e. predictions = model(pic).
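Putting the three fixes together, a minimal sketch of the corrected gradient computation might look like this (variable names follow the question's code; the optimizer step is left out, since optimizer.apply_gradients expects tf.Variable objects rather than plain tensors):

import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import vgg16

base_model = vgg16.VGG16(weights='imagenet', include_top=False)
layer, feature_map = 1, 24

pic = vgg16.preprocess_input(np.random.uniform(size=(1, 96, 96, 3)))
pic = tf.convert_to_tensor(pic, dtype=tf.float32)

model = tf.keras.Model(inputs=base_model.inputs,
                       outputs=base_model.layers[layer].output)

with tf.GradientTape() as tape:
    tape.watch(pic)                   # fix 1: watch the input tensor
    predictions = model(pic)          # fix 3: call the model, don't use model.predict
    loss = tf.reduce_mean(predictions[0][:, :, feature_map])

gradients = tape.gradient(loss, pic)  # fix 2: differentiate w.r.t. pic, not pic[0]
print(gradients.shape)                # (1, 96, 96, 3)

From here, one way to maximize the activation would be a manual gradient-ascent update such as pic = pic + step_size * gradients in a loop, or making pic a tf.Variable so an optimizer can apply the gradients directly.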
