Understanding multi-layer LSTM


Problem description

I'm trying to understand and implement a multi-layer LSTM. The problem is that I don't know how the layers connect. I have two ideas in mind:

  1. At each timestep, the hidden state H of the first LSTM becomes the input of the second LSTM.

  2. At each timestep, the hidden state H of the first LSTM becomes the initial value of the second LSTM's hidden state, and the input of the first LSTM also becomes the input of the second LSTM.


Please help!

Solution

TL;DR: Each LSTM cell at time t and level l has an input x(t) and a hidden state h(l, t). In the first layer, the inputs are the actual sequence element x(t) and that layer's previous hidden state h(1, t-1); in every higher layer, the input is the hidden state of the corresponding cell in the layer below, h(l-1, t), together with the layer's own previous hidden state h(l, t-1).
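In other words, the first idea from the question is the right one: at every timestep, the hidden state of layer l-1 is fed as the input of layer l. Below is a minimal sketch of that wiring using PyTorch's nn.LSTMCell; the layer sizes and variable names are illustrative assumptions, not part of the original answer.

# Minimal sketch (assumed sizes) of a 2-layer LSTM unrolled by hand with nn.LSTMCell.
import torch
import torch.nn as nn

input_size, hidden_size, seq_len, batch = 8, 16, 5, 4

layer1 = nn.LSTMCell(input_size, hidden_size)    # level l = 1
layer2 = nn.LSTMCell(hidden_size, hidden_size)   # level l = 2: its input size is layer 1's hidden size

x = torch.randn(seq_len, batch, input_size)      # the sequence x(t), t = 0..seq_len-1

h1 = torch.zeros(batch, hidden_size); c1 = torch.zeros(batch, hidden_size)
h2 = torch.zeros(batch, hidden_size); c2 = torch.zeros(batch, hidden_size)

for t in range(seq_len):
    # Layer 1 sees the sequence element x(t) and its own previous state h(1, t-1).
    h1, c1 = layer1(x[t], (h1, c1))
    # Layer 2 sees layer 1's hidden state h(1, t) and its own previous state h(2, t-1).
    h2, c2 = layer2(h1, (h2, c2))

print(h2.shape)  # hidden state of the top layer at the last timestep: (batch, hidden_size)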

From https://arxiv.org/pdf/1710.02254.pdf:

To increase the capacity of GRU networks (Hermans and Schrauwen 2013), recurrent layers can be stacked on top of each other. Since the GRU does not have two output states, the same output hidden state h'2 is passed to the next vertical layer. In other words, the h1 of the next layer will be equal to h'2. This forces the GRU to learn transformations that are useful along depth as well as time.
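For completeness, deep-learning libraries implement this stacking for you; the hand-rolled sketch above corresponds (up to weight initialization) to asking PyTorch's nn.LSTM for two layers. The sizes below are arbitrary assumptions:

# The same stacking via a single module: num_layers=2 wires h(l-1, t) into layer l internally.
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=8, hidden_size=16, num_layers=2)

x = torch.randn(5, 4, 8)             # (seq_len, batch, input_size)
output, (h_n, c_n) = lstm(x)

print(output.shape)  # (5, 4, 16): the top layer's hidden state h(2, t) for every t
print(h_n.shape)     # (2, 4, 16): the final hidden state h(l, T) for each of the 2 layers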
