How Can I Define Only the Gradient for a Tensorflow Subgraph?

Question

First: I am only a few days into TensorFlow, so please bear with me.

I started out from the cifar10 tutorial code, and I am now using a combination of convolutions and eigenvalue decompositions that breaks the symbolic differentiation. That is, the graph gets built, then upon calling train() the script halts with "No gradient defined for operation [...] (op type: SelfAdjointEig)". No surprise there.
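
For illustration, here is a minimal sketch of the kind of graph that triggers this (assuming TF 1.x-style graph building; whether it actually raises depends on your version, since newer releases do register a gradient for the eig op):

import tensorflow as tf

x = tf.placeholder(tf.float32, [4, 4])
s = tf.matmul(x, x, transpose_a=True)  # make the eig input symmetric
e, v = tf.self_adjoint_eig(s)
loss = tf.reduce_sum(e)
# In versions without a registered gradient for the eig op, this raises:
# LookupError: No gradient defined for operation ... (op type: SelfAdjointEig)
grads = tf.gradients(loss, [x])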

The inputs to the subgraph in question are still only the input feature maps and the filters being used. I have the formulas for the gradients at hand, and they should be straightforward to implement given the inputs to the subgraph and the gradient with respect to its output.

From what I can see in the docs, I can register a gradient function for custom ops with RegisterGradient, or override existing ones with the experimental gradient_override_map. Both of those should give me access to exactly the things I need. For example, searching on GitHub I find a lot of examples that access the op's inputs as op.inputs[0] and the like.
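
For concreteness, a minimal sketch of that mechanism (the registration name "CustomSquareGrad" and the choice of the Square op are purely illustrative, and this assumes TF 1.x-style graphs):

import tensorflow as tf

# Register a hand-written gradient function under a new name.
@tf.RegisterGradient("CustomSquareGrad")
def _custom_square_grad(op, grad):
    x = op.inputs[0]       # the op's input, accessed as described above
    return grad * 2.0 * x  # d(x^2)/dx = 2x, written out by hand

x = tf.constant(3.0)
g = tf.get_default_graph()
# Every "Square" op created inside this context picks up the custom gradient.
with g.gradient_override_map({"Square": "CustomSquareGrad"}):
    y = tf.square(x)
dy_dx = tf.gradients(y, [x])[0]  # computed via _custom_square_grad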

The problem I have is that I want to essentially "shortcut" a whole subgraph, not a single op, so I have no single op to decorate. Since this is happening in one of the convolutional layers of the cifar example, I tried using the scope object for that layer. Conceptually, what enters and exits that scope's graph is exactly what I want, so if I could somehow override the gradients for the whole scope, that would already do it.

I saw tf.Graph.create_op, which (I think) I could use to register a new type of operation, and I could then override that operation type's gradient computation with the aforementioned methods. But I don't see a way of defining that op's forward pass without writing it in C++...

Maybe I am approaching this the wrong way entirely? Since all of my forward and backward operations can be implemented with the Python interface, I obviously want to avoid implementing anything in C++.

Answer

Here's a trick from Sergey Ioffe:

Suppose you want a group of ops that behaves as f(x) in the forward pass, but as g(x) in the backward pass. You implement it as:

t = g(x)
y = t + tf.stop_gradient(f(x) - t)

So in your case, your g(x) could be an identity op with a custom gradient defined via gradient_override_map.
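
Putting the pieces together, a minimal sketch (the registration name "SubgraphGrad" and the helper f_subgraph are hypothetical stand-ins for your actual subgraph, and the gradient body is a placeholder where your hand-derived formula would go):

import tensorflow as tf

def f_subgraph(x):
    # Stand-in for the real forward subgraph, e.g. the convolution +
    # eigendecomposition combination from the question.
    e, v = tf.self_adjoint_eig(x)
    return v

@tf.RegisterGradient("SubgraphGrad")
def _subgraph_grad(op, grad):
    x = op.inputs[0]
    # Placeholder: compute your hand-derived gradient of f w.r.t. x here.
    return grad

def forward_f_backward_custom(x):
    g = tf.get_default_graph()
    # g(x): an identity op whose gradient is overridden...
    with g.gradient_override_map({"Identity": "SubgraphGrad"}):
        t = tf.identity(x)
    # ...while stop_gradient keeps the f(x) - t correction term out of
    # the backward pass entirely.
    return t + tf.stop_gradient(f_subgraph(x) - t)

In the forward pass the stop_gradient term makes the result equal f(x); in the backward pass only t contributes, so gradients flow through the overridden Identity op and your hand-written formula is used.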
