Tensorflow: How to write op with gradient in python?


Question

I would like to write a TensorFlow op in python, but I would like it to be differentiable (to be able to compute a gradient).

This question asks how to write an op in python, and the answer suggests using py_func (which has no gradient): Tensorflow: Writing an Op in Python

The TF documentation describes how to add an op starting from C++ code only: https://www.tensorflow.org/versions/r0.10/how_tos/adding_an_op/index.html

In my case, I am prototyping so I don't care about whether it runs on GPU, and I don't care about it being usable from anything other than the TF python API.

Answer

Yes, as mentioned in @Yaroslav's answer, it is possible and the key is the links he references: here and here. I want to elaborate on this answer by giving a concrete example.

Modulo operation: Let's implement the element-wise modulo operation in tensorflow (it already exists but its gradient is not defined; for the sake of the example we will implement it from scratch).

Numpy function: The first step is to define the operation we want for numpy arrays. The element-wise modulo operation is already implemented in numpy, so it is easy:

import numpy as np
def np_mod(x,y):
    return (x % y).astype(np.float32)

The reason for the .astype(np.float32) is that by default tensorflow works with float32 types, and if you give it float64 (the numpy default) it will complain.
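
As a quick check (a minimal sketch reusing np_mod from above), you can inspect the dtypes directly:

a = np.array([0.3, 0.7])   # numpy defaults to float64 here
b = np.array([0.2, 0.5])

print((a % b).dtype)       # float64 -- would not match tf.float32
print(np_mod(a, b).dtype)  # float32 -- safe to declare as tf.float32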

Gradient function: Next we need to define the gradient function for our operation, for each input of the operation, as a tensorflow function. The function needs to take a very specific form: it needs to take the tensorflow representation of the operation op and the gradient of the output grad, and say how to propagate the gradients. In our case the gradients of the mod operation are easy: writing z = x mod y = x - y*floor(x/y) and treating floor(x/y) as locally constant, the derivative is 1 with respect to the first argument and -floor(x/y) with respect to the second (almost everywhere; it is infinite at a finite number of spots, but let's ignore that, see https://math.stackexchange.com/questions/1849280/derivative-of-remainder-function-wrt-denominator for details). So we have

def modgrad(op, grad):
    x = op.inputs[0] # the first argument (normally you need those to calculate the gradient, like the gradient of x^2 is 2x. )
    y = op.inputs[1] # the second argument

    # the propagated gradients with respect to the first and second argument
    # respectively (note: tf.neg was renamed tf.negative in tensorflow 1.0)
    return grad * 1, grad * tf.neg(tf.floordiv(x, y))

The grad function needs to return an n-tuple where n is the number of arguments of the operation. Notice that we need to return tensorflow functions of the input.
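
For comparison, here is what the gradient function of a hypothetical single-input square op would look like (a minimal sketch in the spirit of the _MySquareGrad example mentioned in the code below): for z = x^2 it returns a single tensor, grad times 2x.

def squaregrad(op, grad):
    x = op.inputs[0]      # for z = x^2, dz/dx = 2*x
    return grad * 2 * x   # one input -> one returned gradient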

Making a TF function with gradients: As explained in the sources mentioned above, there is a hack to define gradients of a function using tf.RegisterGradient and tf.Graph.gradient_override_map.

Copying the code from harpone, we can modify the tf.py_func function to make it define the gradient at the same time:

import tensorflow as tf

def py_func(func, inp, Tout, stateful=True, name=None, grad=None):

    # Need to generate a unique name to avoid duplicates:
    rnd_name = 'PyFuncGrad' + str(np.random.randint(0, int(1E+8)))

    tf.RegisterGradient(rnd_name)(grad)  # see _MySquareGrad for grad example
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

The stateful option tells tensorflow whether the function always gives the same output for the same input (stateful = False), in which case tensorflow can simplify the tensorflow graph; this is our case, and it will probably be the case in most situations.
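
Since np_mod always returns the same output for the same input, one could pass stateful=False explicitly (a sketch; x and y are assumed to be tensors, as in the test below):

# np_mod is deterministic, so it is safe to declare it stateless:
z = py_func(np_mod, [x, y], [tf.float32], stateful=False, grad=modgrad)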

Combining it all together: Now that we have all the pieces, we can combine them all together:

from tensorflow.python.framework import ops

def tf_mod(x,y, name=None):

    with ops.op_scope([x,y], name, "mod") as name:
        z = py_func(np_mod,
                        [x,y],
                        [tf.float32],
                        name=name,
                        grad=modgrad)  # <-- here's the call to the gradient
        return z[0]

tf.py_func acts on lists of tensors (and returns a list of tensors); that is why we have [x,y] (and return z[0]). And now we are done, and we can test it.
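
As an aside (a sketch, not part of the recipe above): an op with several outputs would declare one dtype per output in Tout, and tf.py_func would return one tensor per declared output. A hypothetical np_divmod illustrates this:

def np_divmod(x, y):
    # two outputs: quotient and remainder, both cast for tensorflow
    return (x // y).astype(np.float32), (x % y).astype(np.float32)

# Tout has one dtype per output, so we get back two tensors:
q, r = tf.py_func(np_divmod, [x, y], [tf.float32, tf.float32])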

Test:

with tf.Session() as sess:

    x = tf.constant([0.3,0.7,1.2,1.7])
    y = tf.constant([0.2,0.5,1.0,2.9])
    z = tf_mod(x,y)
    gr = tf.gradients(z, [x,y])
    tf.initialize_all_variables().run()

    print(x.eval(), y.eval(),z.eval(), gr[0].eval(), gr[1].eval())

[ 0.30000001 0.69999999 1.20000005 1.70000005] [ 0.2 0.5 1. 2.9000001] [ 0.10000001 0.19999999 0.20000005 1.70000005] [ 1. 1. 1. 1.] [ -1. -1. -1. 0.]

Success! As a sanity check, the gradient with respect to y is -floor(x/y) = [-1., -1., -1., 0.], which matches the last array printed above.
