How to build a simple RNN with a cycle in the graph in TensorFlow?


Problem Description

I've just started playing with TensorFlow and I'm trying to implement a very simple RNN. The RNN has x as input, y as output, and consists of just a single layer that takes x and its previous output as input. Here's a picture of the sort of thing I have in mind:

The problem is, I can't see any way through the TensorFlow API to construct a graph with a cycle in it. Whenever I define a Tensor I have to specify what its inputs are, which means I have to have already defined its inputs. So there's a chicken-and-egg problem.

I don't even know if it makes sense to want to define a graph with a cycle (what gets computed first? Would I have to define an initial value for the softmax node?). I played with the idea of using a variable to represent the previous output, then manually taking the value of y and storing it in the variable after feeding each training sample through. But that would be very slow unless there's a way to represent this procedure in the graph itself (?).

I know the TensorFlow tutorials show example implementations of RNNs but they cheat and pull an LSTM module out of the library which already has the cycle in it. Overall the tutorials are good for stepping you through how to build certain things but they could do a better job of explaining how this beast really works.

So, TensorFlow experts, is there a way to build this thing? How would I go about doing it?

Solution

As a matter of fact, the forward and backward passes in these machine learning frameworks assume that your network does not have cycles. A common way of implementing a recurrent network is to unroll it in time for a fixed number of steps (say, 50), thereby converting a network that has loops into one that does not have any.

For instance, in the docs you are referring to:

https://www.tensorflow.org/versions/r0.7/tutorials/recurrent/index.html

They mention:

In order to make the learning process tractable, it is a common practice to truncate the gradients for backpropagation to a fixed number (num_steps) of unrolled steps.

What this effectively means is that they create num_steps LSTM cells, where each cell takes as input the value of x for the current timestep and the output of the previous LSTM cell.
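To make the unrolling concrete, here is a minimal NumPy sketch of the idea (not the TensorFlow API, and using a plain tanh RNN step rather than an LSTM for brevity; the names Wx, Wh, and b are made up for illustration):

    import numpy as np

    input_size, hidden_size, num_steps = 4, 8, 50  # arbitrary sizes

    rng = np.random.default_rng(0)
    Wx = rng.normal(scale=0.1, size=(input_size, hidden_size))   # input-to-hidden weights
    Wh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden-to-hidden weights
    b = np.zeros(hidden_size)

    xs = rng.normal(size=(num_steps, input_size))  # one input vector per timestep
    h = np.zeros(hidden_size)                      # initial "previous output"

    outputs = []
    for t in range(num_steps):
        # Every unrolled step reuses the SAME weights, so this is still one
        # recurrent layer -- just copied num_steps times into a cycle-free graph.
        h = np.tanh(xs[t] @ Wx + h @ Wh + b)
        outputs.append(h)

Backpropagating through this loop is just ordinary backpropagation through num_steps copies of the step, which is exactly the truncation the quote above describes.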

The BasicLSTMCell that they use and that you think has a loop in fact does not have a loop. An LSTM cell is just an implementation of a single LSTM step (a block that has two inputs [input and memory] and two outputs [output and memory], and uses gates to compute outputs from inputs), not the entire LSTM network.
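To illustrate what "a single LSTM step" means, here is a hedged NumPy sketch of one step: two inputs (the current x plus the previous output and memory) and two outputs (the new output and memory), with gates in between. Gate formulations vary slightly across papers, so treat this as one common variant rather than the literal BasicLSTMCell source:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def lstm_step(x, h_prev, c_prev, W, b):
        # W stacks all four gate weight matrices: shape
        # (input_size + hidden_size, 4 * hidden_size), so one matmul
        # yields every gate pre-activation at once.
        z = np.concatenate([x, h_prev]) @ W + b
        i, f, o, g = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
        g = np.tanh(g)                                # candidate memory update
        c = f * c_prev + i * g                        # new memory
        h = o * np.tanh(c)                            # new output
        return h, c

Chaining num_steps calls to lstm_step, feeding each step's (h, c) into the next, reproduces the unrolled network from the sketch above.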
