gradient calculation for bias term using GradientTape()


Question

I want to calculate gradient tensors with respect to the weight variables and the bias term separately. The gradient for the weight variables is calculated correctly, but the gradient for the bias is not computed properly. Please let me know what the problem is, or correct my code.

import numpy as np
import tensorflow as tf

X = tf.constant([[1.0, 0.1, -1.0], [2.0, 0.2, -2.0], [3.0, 0.3, -3.0], [4.0, 0.4, -4.0], [5.0, 0.5, -5.0]])
b1 = tf.Variable(-0.5)
Bb = tf.constant([[1.0], [1.0], [1.0], [1.0], [1.0]])
Bb = b1 * Bb

Y0 = tf.constant([[-10.0], [-5.0], [0.0], [5.0], [10.0]])

W = tf.Variable([[1.0], [1.0], [1.0]])

with tf.GradientTape() as tape:
    Y = tf.matmul(X, W) + Bb
    print("Y : ", Y.numpy())

    loss_val = tf.reduce_sum(tf.square(Y - Y0))
    print("loss : ", loss_val.numpy())

gw = tape.gradient(loss_val, W)   # gradient calculation works well
gb = tape.gradient(loss_val, b1)  # does NOT work

print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())

Answer

Two things. Firstly, if you look at the docs here:

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape#args

you'll see that you can only make a single call to gradient unless persistent=True is set.
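
To illustrate that first point, here is a minimal sketch (not part of the original answer) of what a second gradient call against a non-persistent tape does:

import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:   # non-persistent by default
    y = x * x

g1 = tape.gradient(y, x)   # fine: returns 6.0
g2 = tape.gradient(y, x)   # raises RuntimeError: the tape has already been used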

Secondly, you're setting Bb = b1 * Bb outside of the tape's context manager, so that op is not being recorded.

import numpy as np
import tensorflow as tf

X = tf.constant([[1.0, 0.1, -1.0], [2.0, 0.2, -2.0], [3.0, 0.3, -3.0], [4.0, 0.4, -4.0], [5.0, 0.5, -5.0]])
b1 = tf.Variable(-0.5)
Bb = tf.constant([[1.0], [1.0], [1.0], [1.0], [1.0]])

Y0 = tf.constant([[-10.0], [-5.0], [0.0], [5.0], [10.0]])

W = tf.Variable([[1.0], [1.0], [1.0]])

# persistent=True allows more than one call to tape.gradient()
with tf.GradientTape(persistent=True) as tape:
    Bb = b1 * Bb   # now inside the tape's context, so it is recorded
    Y = tf.matmul(X, W) + Bb
    print("Y : ", Y.numpy())

    loss_val = tf.reduce_sum(tf.square(Y - Y0))
    print("loss : ", loss_val.numpy())

gw = tape.gradient(loss_val, W)   # gradient w.r.t. the weights
gb = tape.gradient(loss_val, b1)  # gradient w.r.t. the bias now works too

print("gradient W : ", gw.numpy())
print("gradient b : ", gb.numpy())
