How to apply Layer Normalisation in LSTMCell


Problem Description

I want to apply Layer Normalisation to a recurrent neural network while using tf.compat.v1.nn.rnn_cell.LSTMCell.

There is a LayerNormalization class, but how should I apply it in LSTMCell? I am using tf.compat.v1.nn.rnn_cell.LSTMCell because I want to use a projection layer. How should I achieve normalisation in this case?

import tensorflow as tf

class LM(tf.keras.Model):
  def __init__(self, hidden_size=2048, num_layers=2):
    super(LM, self).__init__()
    self.hidden_size = hidden_size
    self.num_layers = num_layers
    self.lstm_layers = []
    self.proj_dim = 640
    for i in range(self.num_layers):
        name1 = 'lm_lstm'+str(i)
        self.cell = tf.compat.v1.nn.rnn_cell.LSTMCell(2048, num_proj=640)  # 2048 units, output projected to 640
        self.lstm_layers.append(tf.keras.layers.RNN(self.cell, return_sequences=True, name=name1))

  def call(self, x):
    for i in range(self.num_layers):
      output = self.lstm_layers[i](x)
      x = output
    state_h = ""
    return x, state_h

Solution

It depends on whether you want to apply the normalization at cell level or at layer level - I'm not sure which one is the correct way to do it - the paper doesn't specify it. Here is an older implementation that you might use for inspiration.

To normalize at cell level, you probably need to create a custom RNNCell and implement the normalization there.
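
As a rough sketch of that idea (not from the original answer, and assuming TF 2.x-era Keras APIs): a wrapper cell can delegate to an inner cell and layer-normalize its output at every time step. The class name LayerNormCellWrapper is made up for illustration, and it wraps a plain tf.keras.layers.LSTMCell for simplicity, since that cell has no num_proj argument; whether the compat.v1 projection cell can be dropped in unchanged would need to be verified.

import tensorflow as tf

# Hypothetical sketch: a cell that applies LayerNormalization to the inner
# cell's output at every step, leaving the recurrent state untouched.
class LayerNormCellWrapper(tf.keras.layers.Layer):
    def __init__(self, inner_cell, **kwargs):
        super().__init__(**kwargs)
        self.inner_cell = inner_cell
        self.norm = tf.keras.layers.LayerNormalization()

    @property
    def state_size(self):
        return self.inner_cell.state_size

    @property
    def output_size(self):
        return self.inner_cell.output_size

    def get_initial_state(self, inputs=None, batch_size=None, dtype=None):
        # Delegate state creation to the wrapped cell.
        return self.inner_cell.get_initial_state(
            inputs=inputs, batch_size=batch_size, dtype=dtype)

    def call(self, inputs, states):
        output, new_states = self.inner_cell(inputs, states)
        return self.norm(output), new_states

# Example usage: the normalization happens inside the recurrence, per step.
cell = LayerNormCellWrapper(tf.keras.layers.LSTMCell(640))
layer = tf.keras.layers.RNN(cell, return_sequences=True)
out = layer(tf.random.normal([4, 10, 32]))  # (batch, time, features)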

P.S. You might also be able to apply LayerNormalization to the output of the RNN, for example as shown below, but you'll need to think carefully about whether it has the desired effect, especially given the variable shapes inherent to sequence models.

self.lstm_layers.append(tf.keras.layers.RNN(self.cell, return_sequences=True, name=name1))
self.lstm_layers.append(tf.keras.layers.LayerNormalization())
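
One practical detail with this layer-level arrangement (an observation about the question's code, not part of the original answer): self.lstm_layers would then hold both RNN and LayerNormalization layers, so the loop in call should iterate over the whole list rather than over range(self.num_layers), roughly:

def call(self, x):
    for layer in self.lstm_layers:  # alternating RNN and LayerNormalization layers
        x = layer(x)
    return x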
