了解"backward()":如何从头开始编写Pytorch函数".backward()"? [英] Understanding ‘backward()’: How to code the Pytorch function ‘.backward()’ from scratch?

Problem Description

I'm a newcomer learning Deep Learning, and I'm stuck trying to understand what PyTorch's '.backward()' does, since it performs most of the work there. Therefore, I'm trying to understand in detail what the backward function does, and I'm going to try to code what it does step by step. Can you recommend any resources (books, videos, GitHub repos) for getting started coding the function? Thank you for your time and, hopefully, for your help.

Recommended Answer

backward() calculates the gradients with respect to (w.r.t.) the graph leaves. The grad() function is more general: it can calculate the gradients w.r.t. any inputs (leaves included).
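
For orientation, here is a minimal PyTorch sketch contrasting the two calls (variable names are illustrative; it uses the same example function as the code below):

import torch

# Leaf tensors: created directly by the user with requires_grad=True.
x1 = torch.tensor(1.5, requires_grad=True)
x2 = torch.tensor(0.5, requires_grad=True)

f = (torch.sin(x2) + 1) / (x2 + torch.exp(x1)) + x1 * x2

# backward() accumulates d(f)/d(leaf) into the .grad attribute of each leaf.
f.backward(retain_graph=True)
print(x1.grad, x2.grad)

# torch.autograd.grad() instead returns the gradients w.r.t. any chosen inputs.
(df_dx2,) = torch.autograd.grad(f, x2)
print(df_dx2)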

I implemented a grad() function some time ago; you can check it out below. It uses the power of Automatic Differentiation (AD).

import math
class ADNumber:
    
    def __init__(self,val, name=""): 
        self.name=name
        self._val=val
        self._children=[] # list of (local derivative, child node) pairs
        
    def __truediv__(self,other):
        new = ADNumber(self._val / other._val, name=f"{self.name}/{other.name}")
        self._children.append((1.0/other._val,new))
        other._children.append((-self._val/other._val**2,new)) # first derivation of 1/x is -1/x^2
        return new 

    def __mul__(self,other):
        new = ADNumber(self._val*other._val, name=f"{self.name}*{other.name}")
        self._children.append((other._val,new))
        other._children.append((self._val,new))
        return new

    def __add__(self,other):
        if isinstance(other, (int, float)):
            other = ADNumber(other, str(other))
        new = ADNumber(self._val+other._val, name=f"{self.name}+{other.name}")
        self._children.append((1.0,new))
        other._children.append((1.0,new))
        return new

    def __sub__(self,other):
        new = ADNumber(self._val-other._val, name=f"{self.name}-{other.name}")
        self._children.append((1.0,new))
        other._children.append((-1.0,new))
        return new
    
            
    @staticmethod
    def exp(self):
        new = ADNumber(math.exp(self._val), name=f"exp({self.name})")
        self._children.append((math.exp(self._val),new)) # first derivative of exp(x) is exp(x)
        return new

    @staticmethod
    def sin(self):
        new = ADNumber(math.sin(self._val), name=f"sin({self.name})")      
        self._children.append((math.cos(self._val),new)) # first derivation is cos
        return new
    
    def grad(self,other):
        # Reverse-mode chain rule: d(self)/d(other) is the sum over all children
        # of `other` of (local derivative to that child) * d(self)/d(child).
        if self==other:
            return 1.0
        else:
            result=0.0
            for child in other._children:
                result+=child[0]*self.grad(child[1])
            return result
        
A = ADNumber # shortcuts
sin = A.sin
exp = A.exp

def print_childs(f, wrt): # with respect to
    for e in f._children:
        print("child:", wrt, "->" , e[1].name, "grad: ", e[0])
        print_childs(e[1], e[1].name)
        
    
x1 = A(1.5, name="x1")
x2 = A(0.5, name="x2")
f=(sin(x2)+1)/(x2+exp(x1))+x1*x2

print_childs(x2,"x2")
print("\ncalculated gradient for the function f with respect to x2:", f.grad(x2))

Out:

child: x2 -> sin(x2) grad:  0.8775825618903728
child: sin(x2) -> sin(x2)+1 grad:  1.0
child: sin(x2)+1 -> sin(x2)+1/x2+exp(x1) grad:  0.20073512936690338
child: sin(x2)+1/x2+exp(x1) -> sin(x2)+1/x2+exp(x1)+x1*x2 grad:  1.0
child: x2 -> x2+exp(x1) grad:  1.0
child: x2+exp(x1) -> sin(x2)+1/x2+exp(x1) grad:  -0.05961284871202578
child: sin(x2)+1/x2+exp(x1) -> sin(x2)+1/x2+exp(x1)+x1*x2 grad:  1.0
child: x2 -> x1*x2 grad:  1.5
child: x1*x2 -> sin(x2)+1/x2+exp(x1)+x1*x2 grad:  1.0

calculated gradient for the function f with respect to x2: 1.6165488003791766
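
As a sanity check, the closed-form derivative of f with respect to x2 can be evaluated directly; it agrees with the AD result above:

import math

x1, x2 = 1.5, 0.5
# d/dx2 [ (sin(x2)+1)/(x2+exp(x1)) + x1*x2 ]
df_dx2 = (math.cos(x2) / (x2 + math.exp(x1))
          - (math.sin(x2) + 1) / (x2 + math.exp(x1)) ** 2
          + x1)
print(df_dx2)  # ~1.6165488003791766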
