How to make sure your computation graph is differentiable


Problem description

Some of the Tensorflow operations (e.g. tf.argmax) are not differentiable (i.e. no gradients are calculated and used in back-propagation).

An answer to "Tensorflow what operations are differentiable and what are not?" suggests searching for RegisterGradient in the Tensorflow code. I also noticed Tensorflow has a tf.NotDifferentiable API call for declaring an operation to be non-differentiable.
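For reference, a minimal sketch of what those two mechanisms look like in TensorFlow 1.x code ("MyOp" and "MyOtherOp" are hypothetical op type names used only for illustration):

import tensorflow as tf

# Gradients are registered per op type; grepping for RegisterGradient in
# the TensorFlow source finds entries like this.
@tf.RegisterGradient("MyOp")
def _my_op_grad(op, grad):
  return grad  # pass the incoming gradient straight through

# Op types that deliberately have no gradient are declared like this:
tf.NotDifferentiable("MyOtherOp")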

Is there a warning issued if I use non-differentiable functions? Is there a programmatic way to ensure that my entire computation graph is differentiable?

Recommended answer

Most floating point operations will have gradients, so a first pass answer would just be to check that there are no int32/int64 dtype Tensors in the graph. This is easy to do, but probably not useful (i.e. any non-trivial model will be doing non-differentiable indexing operations).
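As a rough illustration of that first pass (a sketch which assumes integer-valued tensors mark the spots where gradients stop; integer_tensors is a hypothetical helper, not a TensorFlow API):

import tensorflow as tf

def integer_tensors(graph):
  # Yield (op name, tensor) for every int32/int64 tensor in the graph.
  for op in graph.get_operations():
    for tensor in op.outputs:
      if tensor.dtype in (tf.int32, tf.int64):
        yield op.name, tensor

with tf.Graph().as_default() as g:
  x = tf.placeholder(tf.float32, [None, 10])
  idx = tf.argmax(x, axis=1)  # int64 output: gradients will not flow here
  for name, tensor in integer_tensors(g):
    print(name, tensor.dtype)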

You could do some type of introspection, looping over the operations in the GraphDef and checking that they have gradients registered. I would argue that this is not terribly useful either; if we don't trust that gradients are registered in the first place, why trust that they're correct if registered?
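If you do want that kind of introspection, something along these lines would be a starting point (it leans on TensorFlow's internal gradient registry via ops.get_gradient_function, which is not a supported public API, so treat it as a sketch):

import tensorflow as tf
from tensorflow.python.framework import ops  # internal module, may change between versions

def ops_without_gradients(graph):
  # Collect ops whose type has no gradient function registered at all;
  # ops declared via NotDifferentiable register None and are listed separately.
  missing, declared_non_diff = [], []
  for op in graph.get_operations():
    if not op.inputs:  # source ops (Placeholder, Const) need no gradient
      continue
    try:
      if ops.get_gradient_function(op) is None:
        declared_non_diff.append(op)
    except LookupError:
      missing.append(op)
  return missing, declared_non_diff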

Instead, I'd go with numerical gradient checking at a few points for your model. For example, let's say we register a PyFunc without a gradient:

import tensorflow as tf
import numpy

def my_func(x):
  return numpy.sinh(x)

with tf.Graph().as_default():
  inp = tf.placeholder(tf.float32)
  # py_func has no registered gradient, so only the "+ inp" term
  # contributes to the symbolic gradient below.
  y = tf.py_func(my_func, [inp], tf.float32) + inp
  grad, = tf.gradients(y, inp)
  with tf.Session() as session:
    print(session.run([y, grad], feed_dict={inp: 3}))
    # Compare the symbolic gradient against a finite-difference estimate.
    print("Gradient error:", tf.test.compute_gradient_error(inp, [], y, []))

This gives me output like:

[13.017875, 1.0]
Gradient error: 1.10916996002

Numerical gradients can be a bit tricky, but generally any gradient error more than a few orders of magnitude above machine epsilon (~1e-7 for float32) would raise red flags for me for a supposedly smooth function.
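For comparison, running the same numerical check on a graph where every op does have a gradient should give a much smaller error (a sketch that swaps in tf.sinh, which has a registered gradient, for the py_func above):

import tensorflow as tf

with tf.Graph().as_default():
  inp = tf.placeholder(tf.float32)
  y = tf.sinh(inp) + inp  # same function, but with gradients registered
  with tf.Session():
    # Should be far smaller than the ~1.1 error seen with the py_func version.
    print("Gradient error:", tf.test.compute_gradient_error(inp, [], y, []))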
