如何获得梯度和Hessian |同情 [英] How to get the Gradient and Hessian | Sympy

查看:52
本文介绍了如何获得梯度和Hessian |同情的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

情况是这样的:我有一个符号函数 lamb,它是变量 z 的元素和变量 h 的函数元素的函数.这是羔羊符号函数的图像

Here is the situation: I have a symbolic function lamb which is function of the elements of the variable z and the functions elements of the variable h. Here is an image of the lamb symbolic function

.

现在我想计算这个函数关于变量 eta 和 xi 的梯度和 Hessian.当然,我用谷歌搜索了它,但我找不到直接的方法来做到这一点.我发现的是这里,但正如我所说,它没有看到成为这种情况的最佳方法.任何想法?波纹管,源代码.谢谢.

Now I would like the compute the Gradient and Hessian of this function with respect to the variables eta and xi. Of course I googled for it but I could not find a straight way for doing this. What I found is here, but as I said it does not seen to be the best approach for this situation. Any idea? Bellow, the source code. Thanks.

from sympy import Symbol, Matrix, Function, simplify

eta = Symbol('eta')
xi = Symbol('xi')

x = Matrix([[xi],[eta]])

h = [Function('h_'+str(i+1))(x[0],x[1]) for i in range(3)]
z = [Symbol('z_'+str(i+1)) for i in range(3)]

lamb = 0
for i in range(3):
    lamb += 1/(2*sigma**2)*(z[i]-h[i])**2
simplify(lamb)

推荐答案

您可以使用 Stelios 建议的非常 Pythonic 的方式,或者使用一些最近添加到 SymPy 的功能:

You could either use the very Pythonic way suggested by Stelios, or use some recently added features to SymPy:

In [14]: from sympy.tensor.array import derive_by_array

In [15]: derive_by_array(lamb, (eta, xi))
Out[15]:
[-(z_1 - h_1(xi, eta))*Derivative(h_1(xi, eta), eta)/sigma**2 - (z_2 - h_2(xi,
 eta))*Derivative(h_2(xi, eta), eta)/sigma**2 - (z_3 - h_3(xi, eta))*Derivativ
e(h_3(xi, eta), eta)/sigma**2, -(z_1 - h_1(xi, eta))*Derivative(h_1(xi, eta), 
xi)/sigma**2 - (z_2 - h_2(xi, eta))*Derivative(h_2(xi, eta), xi)/sigma**2 - (z
_3 - h_3(xi, eta))*Derivative(h_3(xi, eta), xi)/sigma**2]

不幸的是,对于 N-dim 数组仍然缺少打印机,您可以通过将它们转换为列表来进行可视化(或者,使用 .tomatrix()):

Unfortunately the printer is still missing for N-dim arrays, you can visualize by converting them to a list (or, alternatively, using .tomatrix()):

In [16]: list(derive_by_array(lamb, (eta, xi)))
Out[16]: 
⎡                  ∂                              ∂                           
⎢  (z₁ - h₁(ξ, η))⋅──(h₁(ξ, η))   (z₂ - h₂(ξ, η))⋅──(h₂(ξ, η))   (z₃ - h₃(ξ, η
⎢                  ∂η                             ∂η                          
⎢- ──────────────────────────── - ──────────────────────────── - ─────────────
⎢                2                              2                             
⎣               σ                              σ                              

   ∂                               ∂                              ∂           
))⋅──(h₃(ξ, η))    (z₁ - h₁(ξ, η))⋅──(h₁(ξ, η))   (z₂ - h₂(ξ, η))⋅──(h₂(ξ, η))
   ∂η                              ∂ξ                             ∂ξ          
───────────────, - ──────────────────────────── - ────────────────────────────
 2                               2                              2             
σ                               σ                              σ              

                   ∂           ⎤
   (z₃ - h₃(ξ, η))⋅──(h₃(ξ, η))⎥
                   ∂ξ          ⎥
 - ────────────────────────────⎥
                 2             ⎥
                σ              ⎦

对于 Hessian,只需重复该过程两次:

For the Hessian, just repeat the procedure twice:

In [18]: list(derive_by_array(derive_by_array(lamb, (eta, xi)), (eta, xi)))
Out[18]: 
⎡                    2                               2                        
⎢                   ∂                               ∂                         
⎢  (z₁ - h₁(ξ, η))⋅───(h₁(ξ, η))   (z₂ - h₂(ξ, η))⋅───(h₂(ξ, η))   (z₃ - h₃(ξ,
⎢                    2                               2                        
⎢                  ∂η                              ∂η                         
⎢- ───────────────────────────── - ───────────────────────────── - ───────────
⎢                 2                               2                           
⎣                σ                               σ                            

       2                                                                      
      ∂                            2                 2                 2      
 η))⋅───(h₃(ξ, η))   ⎛∂           ⎞    ⎛∂           ⎞    ⎛∂           ⎞       
       2             ⎜──(h₁(ξ, η))⎟    ⎜──(h₂(ξ, η))⎟    ⎜──(h₃(ξ, η))⎟     (z
     ∂η              ⎝∂η          ⎠    ⎝∂η          ⎠    ⎝∂η          ⎠       
────────────────── + ─────────────── + ─────────────── + ───────────────, - ──
    2                        2                 2                 2            
   σ                        σ                 σ                 σ             


                 2                                 2                          
                ∂                                 ∂                           
₁ - h₁(ξ, η))⋅─────(h₁(ξ, η))   (z₂ - h₂(ξ, η))⋅─────(h₂(ξ, η))   (z₃ - h₃(ξ, 
              ∂ξ ∂η                             ∂ξ ∂η                         
───────────────────────────── - ─────────────────────────────── - ────────────
              2                                 2                             
             σ                                 σ                              


       2                                                                      
      ∂               ∂            ∂              ∂            ∂              
η))⋅─────(h₃(ξ, η))   ──(h₁(ξ, η))⋅──(h₁(ξ, η))   ──(h₂(ξ, η))⋅──(h₂(ξ, η))   
    ∂ξ ∂η             ∂η           ∂ξ             ∂η           ∂ξ             
─────────────────── + ───────────────────────── + ───────────────────────── + 
    2                              2                           2              
   σ                              σ                           σ               


                                                2                             
∂            ∂                                 ∂                              
──(h₃(ξ, η))⋅──(h₃(ξ, η))    (z₁ - h₁(ξ, η))⋅─────(h₁(ξ, η))   (z₂ - h₂(ξ, η))
∂η           ∂ξ                              ∂ξ ∂η                            
─────────────────────────, - ─────────────────────────────── - ───────────────
             2                               2                                
            σ                               σ                                 


    2                                 2                                       
   ∂                                 ∂               ∂            ∂           
⋅─────(h₂(ξ, η))   (z₃ - h₃(ξ, η))⋅─────(h₃(ξ, η))   ──(h₁(ξ, η))⋅──(h₁(ξ, η))
 ∂ξ ∂η                             ∂ξ ∂η             ∂η           ∂ξ          
──────────────── - ─────────────────────────────── + ─────────────────────────
 2                                 2                              2           
σ                                 σ                              σ            


                                                                             ∂
   ∂            ∂              ∂            ∂               (z₁ - h₁(ξ, η))⋅──
   ──(h₂(ξ, η))⋅──(h₂(ξ, η))   ──(h₃(ξ, η))⋅──(h₃(ξ, η))                      
   ∂η           ∂ξ             ∂η           ∂ξ                              ∂ξ
 + ───────────────────────── + ─────────────────────────, - ──────────────────
                2                           2                              2  
               σ                           σ                              σ   

2                               2                               2             
                               ∂                               ∂              
─(h₁(ξ, η))   (z₂ - h₂(ξ, η))⋅───(h₂(ξ, η))   (z₃ - h₃(ξ, η))⋅───(h₃(ξ, η))   
2                               2                               2             
                              ∂ξ                              ∂ξ              
─────────── - ───────────────────────────── - ───────────────────────────── + 
                             2                               2                
                            σ                               σ                 

                                                   ⎤
              2                 2                 2⎥
⎛∂           ⎞    ⎛∂           ⎞    ⎛∂           ⎞ ⎥
⎜──(h₁(ξ, η))⎟    ⎜──(h₂(ξ, η))⎟    ⎜──(h₃(ξ, η))⎟ ⎥
⎝∂ξ          ⎠    ⎝∂ξ          ⎠    ⎝∂ξ          ⎠ ⎥
─────────────── + ─────────────── + ───────────────⎥
        2                 2                 2      ⎥
       σ                 σ                 σ       ⎦

这篇关于如何获得梯度和Hessian |同情的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆