Keras gradient wrt something else


Question

I am working to implement the method described in the article https://drive.google.com/file/d/1s-qs-ivo_fJD9BU_tM5RY8Hv-opK4Z-H/view. The final algorithm to use is on page 6 of the paper:

  • d is a unit vector
  • xhi is a non-null number
  • D is the loss function (sparse categorical cross-entropy in my case)
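
If I read the paper correctly, the first line of step 3 (the one I am stuck on) amounts to the following, where x is an input image, y its label and f the network; the perturbation size ε is my notation, not the paper's:

    g     ← ∇_r D(y, f(x + r)), evaluated at r = xhi·d
    d     ← g / ‖g‖₂
    r_adv ← ε·d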

The idea is to do adversarial training: modify the data in the direction where the network is the most sensitive to small changes, and train the network on the modified data but with the same labels as the original data.

I am trying to implement this method in Keras with the MNIST dataset and mini-batches of 100 samples, but I can't get my head around the computation of the gradient wrt r (the first line of the 3rd step of the algorithm). I can't figure out how to compute it with Keras. Here is my code:

import numpy as np
import tensorflow as tf
from tensorflow.keras import losses

loss = losses.SparseCategoricalCrossentropy()

for epoch in range(5):
    print(f"Start of epoch {epoch}")
    for step, (xBatchTrain, yBatchTrain) in enumerate(trainDataset):
        # Generating the 100 unit vectors
        randomVectors = np.random.random(xBatchTrain.shape)
        U = randomVectors / np.linalg.norm(randomVectors, axis=1)[:, None]

        # Generating the r vectors (Xi is a scalar, so plain multiplication broadcasts)
        Xi = 2
        R = tf.convert_to_tensor(U * Xi, dtype='float32')

        dataNoised = xBatchTrain + R

        with tf.GradientTape(persistent=True) as imTape:
            imTape.watch(R)
            # Getting the losses
            C = [loss(label, pred) for label, pred in
                 zip(yBatchTrain, dumbModel(dataNoised, training=False))]

        # Getting the gradient wrt r for each image
        for l, r in zip(C, R):
            print(imTape.gradient(l, r))

"print"对于每个样本,该行均返回None.我应该给我返回一个784个值的矢量,每个矢量代表一个像素?

The "print" line returns None for every sample. I should return me a vector of 784 values, each for one pixel?

(I apologize if parts of the code are ugly; I am new to Keras, TF and deep learning.)

Here is a gist with the whole notebook: https://gist.github.com/DridriLaBastos/136a8e9d02b311e82fe22ec1c2850f78

Answer

First, move dataNoised = xBatchTrain + R inside the with tf.GradientTape(persistent=True) as imTape: block, so that the tape records the operations involving R.
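
For instance, a minimal sketch of the reordering, with the same names as in the question:

with tf.GradientTape(persistent=True) as imTape:
    imTape.watch(R)
    # The addition now happens while the tape is recording,
    # so the path from R to the losses is differentiable
    dataNoised = xBatchTrain + R
    C = [loss(label, pred) for label, pred in
         zip(yBatchTrain, dumbModel(dataNoised, training=False))]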

Second, instead of using:

for l,r in zip(C,R):
    print(imTape.gradient(l,r))

you should use imTape.gradient(C, R) to get the whole set of gradients in a single call: zip breaks the operation dependency on the tensor R (each r in the loop is a new tensor the tape never recorded), which is why every print returns None.
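
A sketch of the corrected gradient call, assuming the tape block above:

# One call for the whole mini-batch; row i is the gradient of C[i]
# wrt R[i], since each per-sample loss only sees its own row of R
grads = imTape.gradient(C, R)
print(grads)

Printed out, this returns something like the following, with the same shape as xBatchTrain: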

tf.Tensor(
[[-1.4924371e-06  1.0490652e-05 -1.8195267e-05 ...  1.5640746e-05
   3.3767541e-05 -2.0983218e-05]
 [ 2.3668531e-02  1.9133706e-02  3.1396169e-02 ... -1.4431887e-02
   5.3144591e-03  6.2225698e-03]
 [ 2.0492254e-03  7.1049971e-04  1.6121448e-03 ... -1.0579333e-03
   2.4968456e-03  8.3572773e-04]
 ...
 [-4.5572519e-03  6.2278998e-03  6.8322839e-03 ... -2.1966733e-03
   1.0822206e-03  1.8687058e-03]
 [-6.3691144e-03 -4.1699030e-02 -9.3158096e-02 ... -2.9496195e-02
  -7.0264392e-02 -3.2520775e-02]
 [-1.4666058e-02  2.0758331e-02  2.9009990e-02 ... -3.2206681e-02
   3.1550713e-02  4.9267178e-03]], shape=(100, 784), dtype=float32)
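
From there, if I read step 3 of the algorithm in the question correctly, each row of the gradient can be normalized to get the updated direction d; a sketch, assuming no row of grads is all zeros:

# Normalize each sample's gradient to a unit vector
d = grads / tf.norm(grads, axis=1, keepdims=True)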
