Unable to understand the behavior of method `build` in tensorflow keras layers (tf.keras.layers.Layer)


Problem description

Layers in tensorflow keras have a method build that is used to defer the creation of weights until you have seen what the input is going to be: a layer's build method.
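For context, this is roughly what a deferred-build layer looks like (a minimal sketch; MyDense and its weight name are made up for illustration):

import tensorflow as tf

class MyDense(tf.keras.layers.Layer):
  """A minimal Dense-like layer that defers weight creation to build."""

  def __init__(self, units):
    super().__init__()
    self.units = units

  def build(self, input_shape):
    # the kernel shape depends on the last input dimension, which is only
    # known here, so the weight is created in build rather than in __init__
    self.kernel = self.add_weight(
        "kernel", shape=(int(input_shape[-1]), self.units))

  def call(self, x):
    return tf.matmul(x, self.kernel)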

I have a few questions I have not been able to find the answer to:

  1. Here it is said that:

If you assign a Layer instance as an attribute of another Layer, the outer layer will start tracking the weights of the inner layer.

What does it mean to track the weights of a layer?

  2. The same link also mentions that:

We recommend creating such sublayers in the __init__ method (since the sublayers will typically have a build method, they will be built when the outer layer gets built).

Does this mean that while running the build method of the subclassed layer (self), there will be an iteration through all of self's attributes, and whichever of them are found to be instances of (subclassed from) tf.keras.layers.Layer will have their build methods run automatically?

  3. I can run the following code:

import tensorflow as tf

class Net(tf.keras.Model):
  """A simple linear model."""

  def __init__(self):
    super(Net, self).__init__()
    self.l1 = tf.keras.layers.Dense(5)
  def call(self, x):
    return self.l1(x)

net = Net()
print(net.variables)

but not this one:

class Net(tf.keras.Model):
  """A simple linear model."""

  def __init__(self):
    super(Net, self).__init__()
    self.l1 = tf.keras.layers.Dense(5)
  def build(self, input_shape):
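    # note: input_shape is not forwarded to super().build()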
    super().build()
  def call(self, x):
    return self.l1(x)

net = Net()
print(net.variables)

Why?

Answer

I would say the "build" mentioned there refers to what happens when you construct a self-defined tf.keras.Model, for example:

net = Net()

then all the tf.keras.layers.Layer objects created in __init__ are stored in net, which is a callable object. In this way net becomes a complete object for TF to train later; that is what "tracking" refers to. The next time you call net(inputs) you can get your outputs.
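As a minimal sketch of what tracking means in practice (Outer and the shapes below are made up for illustration), the weights of a sublayer assigned as an attribute show up on the outer object:

import tensorflow as tf

class Outer(tf.keras.layers.Layer):
  def __init__(self):
    super().__init__()
    self.inner = tf.keras.layers.Dense(3)  # sublayer assigned as an attribute

  def call(self, x):
    return self.inner(x)

outer = Outer()
outer(tf.zeros((1, 4)))              # the first call builds the inner Dense layer
print(len(outer.trainable_weights))  # 2: the inner layer's kernel and bias are tracked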

Here is an example of a Tensorflow self-defined decoder with attention:

import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
  def __init__(self, units):
    super(BahdanauAttention, self).__init__()
    self.W1 = tf.keras.layers.Dense(units)
    self.W2 = tf.keras.layers.Dense(units)
    self.V = tf.keras.layers.Dense(1)

  def call(self, query, values):
    # query hidden state shape == (batch_size, hidden size)
    # query_with_time_axis shape == (batch_size, 1, hidden size)
    # values shape == (batch_size, max_len, hidden size)
    # we are doing this to broadcast addition along the time axis to calculate the score
    query_with_time_axis = tf.expand_dims(query, 1)

    # score shape == (batch_size, max_length, 1)
    # we get 1 at the last axis because we are applying score to self.V
    # the shape of the tensor before applying self.V is (batch_size, max_length, units)
    score = self.V(tf.nn.tanh(
        self.W1(query_with_time_axis) + self.W2(values)))

    # attention_weights shape == (batch_size, max_length, 1)
    attention_weights = tf.nn.softmax(score, axis=1)

    # context_vector shape after sum == (batch_size, hidden_size)
    context_vector = attention_weights * values
    context_vector = tf.reduce_sum(context_vector, axis=1)

    return context_vector, attention_weights

class Decoder(tf.keras.Model):
  def __init__(self, vocab_size, embedding_dim, dec_units, batch_sz):
    super(Decoder, self).__init__()
    self.batch_sz = batch_sz
    self.dec_units = dec_units
    self.embedding = tf.keras.layers.Embedding(vocab_size, embedding_dim)
    self.gru = tf.keras.layers.GRU(self.dec_units,
                                   return_sequences=True,
                                   return_state=True,
                                   recurrent_initializer='glorot_uniform')
    self.fc = tf.keras.layers.Dense(vocab_size)

    # used for attention
    self.attention = BahdanauAttention(self.dec_units)

  def call(self, x, hidden, enc_output):
    # enc_output shape == (batch_size, max_length, hidden_size)
    context_vector, attention_weights = self.attention(hidden, enc_output)

    # x shape after passing through embedding == (batch_size, 1, embedding_dim)
    x = self.embedding(x)

    # x shape after concatenation == (batch_size, 1, embedding_dim + hidden_size)
    x = tf.concat([tf.expand_dims(context_vector, 1), x], axis=-1)

    # passing the concatenated vector to the GRU
    output, state = self.gru(x)

    # output shape == (batch_size * 1, hidden_size)
    output = tf.reshape(output, (-1, output.shape[2]))

    # output shape == (batch_size, vocab)
    x = self.fc(output)

    return x, state, attention_weights

I have tried to put tf.keras.layers.Layer objects in call and got a really poor outcome. I guess that was because if you create a layer inside call, it will be created anew each time call runs, i.e. on every forward-backward propagation.
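A minimal sketch of that anti-pattern (BadNet is a made-up name): because the Dense layer is constructed inside call, its weights are freshly initialized on every forward pass, so nothing persists between training steps:

import tensorflow as tf

class BadNet(tf.keras.Model):
  def call(self, x):
    # a brand-new Dense layer (with newly initialized weights) is created
    # on every call, so there is nothing stable for the optimizer to train
    return tf.keras.layers.Dense(5)(x)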

