RNN: Back-propagation through time when output is taken only at final timestep


Problem description

In this blog post on Recurrent Neural Networks, Denny Britz states: "The above diagram has outputs at each time step, but depending on the task this may not be necessary. For example, when predicting the sentiment of a sentence we may only care about the final output, not the sentiment after each word. Similarly, we may not need inputs at each time step."

In the case where we take output only at the final timestep: how will backpropagation change if there are no outputs at the intermediate time steps, only the final one? We need to define the loss at each time step, but how can we do that without outputs?

Answer

It is not true that you "need to define output at each timestep"; in fact, backpropagation through time is simpler with a single output than with the per-step outputs in the picture. When there is just one output, simply "rotate your network 90 degrees" and it becomes a regular feed-forward network (one in which some signals also enter the hidden layers directly), so backpropagation works as usual, pushing the partial derivatives through the system. When we have outputs at each step, things get trickier: you usually define the true loss as the sum of all the "small losses", and consequently you have to sum all the gradients.
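
Since the answer reduces the single-output case to ordinary backpropagation through the unrolled network, a minimal sketch may help make it concrete. The following NumPy code is illustrative only: the vanilla tanh RNN, the squared-error loss, and all names and shapes (W_xh, W_hh, W_hy, T, n_in, n_h) are assumptions, not taken from the original post. It runs the network forward over a sequence, computes one loss at the final step, and pushes that single gradient backwards through every timestep:

import numpy as np

T, n_in, n_h = 5, 3, 4          # sequence length, input size, hidden size (arbitrary)
rng = np.random.default_rng(0)

W_xh = rng.normal(scale=0.1, size=(n_h, n_in))   # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_h, n_h))    # hidden -> hidden (recurrent)
W_hy = rng.normal(scale=0.1, size=(1, n_h))      # hidden -> output (final step only)

xs = rng.normal(size=(T, n_in))                  # toy input sequence
target = np.array([1.0])                         # one target for the whole sequence

# Forward pass: store every hidden state, but produce an output only at t = T-1.
hs = [np.zeros(n_h)]                             # hs[0] is the initial hidden state
for t in range(T):
    hs.append(np.tanh(W_xh @ xs[t] + W_hh @ hs[-1]))
y = W_hy @ hs[-1]                                # the single output
loss = 0.5 * np.sum((y - target) ** 2)

# Backward pass: exactly one loss term enters the graph, at the last step.
dW_xh, dW_hh = np.zeros_like(W_xh), np.zeros_like(W_hh)
dy = y - target                                  # dL/dy for squared error
dW_hy = np.outer(dy, hs[-1])
dh = W_hy.T @ dy                                 # gradient flowing into the last hidden state
for t in reversed(range(T)):
    dz = (1.0 - hs[t + 1] ** 2) * dh             # backprop through tanh
    dW_xh += np.outer(dz, xs[t])
    dW_hh += np.outer(dz, hs[t])
    dh = W_hh.T @ dz                             # pass the gradient to the previous step

Note that the backward loop only ever receives gradient from the one loss term injected at the last step. In the per-step-output case the answer mentions, you would instead add an extra dL_t/dh term to dh at every iteration of the loop, which is exactly the "sum of all the small losses" (and hence the sum of gradients) described above.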
