Dynamic RNN in Keras: Use Custom RNN Cell to Track Other Outputs at Each Timestep


Question

Is there a way to return multiple outputs for a given timestep when implementing a custom cell for an RNN in Keras? For example, outputs with shapes: (sequences=[batch, timesteps, hidden_units], other_outputs=[batch, timesteps, arbitrary_units], last_hidden_states=[batch, hidden_units]).

My motivation for this stems from Algorithm 1, the 'recurrent decoder', of Self Attention in Variational Sequential Learning for Summarization, which 'accumulates the variational objective' and thus must track several outputs at a given recurrent timestep.

With a Keras RNN, if you pass the return_sequences=True and return_state=True arguments when instantiating the layer, the outputs of a forward pass through the RNN are ([batch, timesteps, hidden_units], [batch, hidden_units]): the hidden states at all timesteps and the last hidden state, respectively. I want to track other outputs at each timestep using the RNN, but I am not sure how. I am thinking I could change the output_size attribute in the custom cell class, but I am not certain this is valid, since the TensorFlow RNN documentation seems to indicate that only a single output is possible at each timestep (i.e., a 'single integer or TensorShape'):

A output_size attribute. This can be a single integer or a TensorShape, which represent the shape of the output. For backward compatible reason, if this attribute is not available for the cell, the value will be inferred by the first element of the state_size.
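
For reference, here is a minimal sketch (my own illustration, not from the original question; it assumes TF 2.x, where the built-in GRUCell defines both attributes) of the standard shapes described above and of the cell attributes the documentation refers to:

import tensorflow as tf

# Stock GRU layer: 'return_sequences' yields the hidden state at every
# timestep; 'return_state' additionally yields the last hidden state.
gru = tf.keras.layers.GRU(units=10, return_sequences=True, return_state=True)

dummy = tf.random.normal(shape=(4, 6, 8))  # (batch, timesteps, features)
all_states, last_state = gru(dummy)
print(all_states.shape)  # (4, 6, 10)
print(last_state.shape)  # (4, 10)

# The attributes the RNN layer inspects on a cell; for the built-in
# GRUCell, output_size matches state_size by default.
cell = tf.keras.layers.GRUCell(units=10)
print(cell.state_size)   # 10
print(cell.output_size)  # 10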

This is what I have so far for a custom-implemented 'RNN cell':

import tensorflow as tf


class CustomGRUCell(tf.keras.layers.Layer):
    def __init__(self, units, arbitrary_units, **kwargs):
        super().__init__(**kwargs)

        self.units = units

        # Custom computation for a timestep t
        self.dense = tf.keras.layers.Dense(units=arbitrary_units)

        # The RNN cell
        self.gru = tf.keras.layers.GRUCell(units=self.units)

        # Required for custom cells...
        self.state_size = tf.TensorShape([self.units])

        # PERHAPS I CHANGE THIS????
        self.output_size = tf.TensorShape([self.units])

    def call(self, input_at_t, states_at_t):
        """Forward pass for the custom RNN cell.

        :param input_at_t: (batch, features) tensor from (batch, t, features)
            inputs
        :param states_at_t: <class 'tuple'> A tuple because, e.g., an
            LSTM carries two hidden states (h_t, c_t) instead of just
            one like the GRU
        """

        # Standard GRU cell call
        output_at_t, states_at_t_plus_1 = self.gru(input_at_t, states_at_t)

        # Another output at particular timestep t
        special_output_at_t = self.dense(input_at_t)

        # The outputs
        # 'output_at_t' will be tracked automatically by 'return_sequences'.... how do I
        # track the other computations at each timestep????
        return [output_at_t, special_output_at_t], states_at_t_plus_1

Then I want the cell to work like this:

# Custom cell and rnn
custom_cell = CustomGRUCell(units=10, arbitrary_units=5)
custom_rnn = tf.keras.layers.RNN(cell=custom_cell, return_sequences=True, return_state=True)

# Arbitrary data
batch = 4
timesteps = 6
features = 8
dummy_data = tf.random.normal(shape=(batch, timesteps, features))

# The output I want
seqs, special_seqs, last_hidden_state = custom_rnn(inputs=dummy_data)

print('(batch, timesteps, units):', seqs.shape)
print('(batch, timesteps, arbitrary_units):', special_seqs.shape)
print('(batch, units):', last_hidden_state.shape)

>>> (batch, timesteps, units): (4, 6, 10)
>>> (batch, timesteps, arbitrary_units): (4, 6, 5)
>>> (batch, units): (4, 10)

Answer

Figured it out. You can make output_size a list of shapes with arbitrary dimensions, and the RNN will then track each of the outputs. The class below also includes the use of constants in the RNN call, because the previously mentioned paper passes an encoder latent space (z_enc) to the recurrent decoder:

import tensorflow as tf


class CustomMultiTimeStepGRUCell(tf.keras.layers.Layer):
    """Illustrates multiple sequence-like (batch, timesteps, size) outputs."""

    def __init__(self, units, arbitrary_units, **kwargs):
        """Defines state for custom cell.
        
        :param units: <class 'int'> Hidden units for the RNN cell.
        :param arbitrary_units: <class 'int'> Hidden units for another
            dense network that outputs a tensor at each timestep in the
            unrolling of the RNN.
        """

        super().__init__(**kwargs)

        # Save args
        self.units = units
        self.arbitrary_units = arbitrary_units

        # Standard recurrent cell
        self.gru = tf.keras.layers.GRUCell(units=self.units)

        # For use with 'constant' kwarg in 'call' method
        self.concatenate = tf.keras.layers.Concatenate()
        self.dense_proj = tf.keras.layers.Dense(units=self.units)

        # For arbitrary computation at timestep t
        self.other_output = tf.keras.layers.Dense(units=self.arbitrary_units)

        # Hidden state size (i.e., h_t). For reference, with
        # 'gru_cell = tf.keras.layers.GRUCell(units=state_size)',
        # calling the cell on a single timestep returns a hidden state
        # of shape '(batch, state_size)'.
        self.state_size = tf.TensorShape([self.units])

        # OUTPUT SIZE: PROBLEM SOLVED!!!!
        # This is the last dimension of the RNN sequence output.
        # Typically the last dimension matches the dimension of 
        # self.state_size, and in fact the keras RNN will infer 
        # the output size based on state size if output size is not
        # specified. If the output size does not match the state size,
        # you must specify it explicitly, and as a list when the cell
        # emits multiple outputs per timestep.
        self.output_size = [tf.TensorShape([self.units]), tf.TensorShape([self.arbitrary_units])]

    def call(self, input_at_t, states_at_t, constants):
        """Forward pass for custom RNN cell.
        
        :param input_at_t: (batch, features) tensor from (batch, t, features)
            inputs
        :param states_at_t: <class 'tuple'> that has 1 element if
            using GRUCell (h_t), or 2 elements if using LSTMCell (h_t, c_t)
        :param constants: <class 'tuple'> Unchanging tensors to be used
            in the unrolling of the RNN.

        :return: <class 'tuple'> with two elements.
            (1) <class 'list'> Both elements of this list are tensors
            that are tracked for each timestep in the unrolling of the RNN. 
            (2) Tensor representing the hidden state passed to the next
            cell.

            In the brief graphic below, a_t denotes the arbitrary output
            at each timestep. y_t = h_t_plus_1. x_t is some input at
            timestep t.

                    a_t  y_t
                     ^    ^
                   __|____|
            h_t    |      | h_t_plus_1
            -----> |      | ----------> .....
                   |______|
                      ^
                      |
                     x_t
         
            When all timesteps in x where x = {x_t}_{t=1}^{T} are processed
            by the RNN, the resulting shapes of the outputs assuming there 
            is only a single sample (batch = 1) would be the following:
            Y = (1, timesteps, units)
            A = (1, timesteps, arbitrary_units)
            h_t_plus_1 = (1, units)  # Last hidden state
            
            For a concrete example, see the end of this codeblock.
        """

        # Get correct inputs -- by default these args are tuples...
        # so you must index 0 to get the relevant element.
        # Note, if you are using LSTM, then the hidden states passed to the
        # the next cell in the RNN will be a tuple with two elements
        # i.e., (h_t, c_t) for the hidden and cell state, respectively.
        states_at_t = states_at_t[0]
        z_enc = constants[0]

        # Combine the states with z_enc
        combined = self.concatenate([states_at_t, z_enc])

        # Project to dimensions for GRU cell
        special_states_at_t = self.dense_proj(combined)

        # Standard GRU call
        output_at_t, states_at_t_plus_1 = self.gru(input_at_t, special_states_at_t)

        # Get another output at t
        arbitrary_output_at_t = self.other_output(input_at_t)

        # The outputs
        return [output_at_t, arbitrary_output_at_t], states_at_t_plus_1

# Dims
batch = 4
timesteps = 3
features = 12
latent = 8
hidden_units = 10
arbitrary_units = 15

# Data
inputs = tf.random.normal(shape=(batch, timesteps, features))
h_t = tf.zeros(shape=(batch, hidden_units))
z_enc = tf.random.normal(shape=(batch, latent))

# An RNN cell to test multitimestep outputs
custom_multistep_cell = CustomMultiTimeStepGRUCell(units=hidden_units, arbitrary_units=arbitrary_units)
custom_multistep_rnn = tf.keras.layers.RNN(custom_multistep_cell, return_sequences=True, return_state=True)

# Call cell
outputs, special_outputs, last_hidden = custom_multistep_rnn(inputs, initial_state=h_t, constants=z_enc)
print(outputs.shape)
print(special_outputs.shape)
print(last_hidden.shape)

>>> (4, 3, 10)
>>> (4, 3, 15)
>>> (4, 10)
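
As a follow-up, here is a minimal sketch of wiring the cell into a Keras model (my addition, not part of the original answer; it assumes the TF 2.x functional API, whose RNN layer accepts a constants keyword at call time, and the variable names are illustrative):

# Hypothetical wrapper model built around the custom cell above.
seq_inputs = tf.keras.Input(shape=(None, features))  # variable-length sequences
z_inputs = tf.keras.Input(shape=(latent,))           # encoder latent 'constant'

cell = CustomMultiTimeStepGRUCell(units=hidden_units, arbitrary_units=arbitrary_units)
rnn = tf.keras.layers.RNN(cell, return_sequences=True, return_state=True)

# 'constants' is forwarded unchanged to the cell at every timestep.
seqs, special_seqs, h_last = rnn(seq_inputs, constants=z_inputs)

model = tf.keras.Model(inputs=[seq_inputs, z_inputs], outputs=[seqs, special_seqs, h_last])
model.summary()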

