Sampled Softmax in a Keras Model

Question

Some approaches I have considered:

Inheriting from the Model class: Sampled softmax in tensorflow keras

Inheriting from the Layer class: How can I use TensorFlow's sampled softmax loss function in a Keras model?

Of the two approaches, the Model approach is cleaner. The Layer approach is a little hacky: it pushes the target in as part of the input, and then bye bye multi-output models.

I'd like some help with subclassing the Model class. Specifically: 1) Unlike the first approach, I would like to take in any number of layers, as we do when specifying a standard Keras model. For example,

class LanguageModel(tf.keras.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

2) I am looking to incorporate the code below into the model class, but want to let the Model class recognize that

def call(self, y_true, input):
    """Reshape y_true and input so that they fit each other."""
    input = tf.reshape(input, (-1, self.hidden_size))
    y_true = tf.reshape(y_true, (-1, 1))
    # Placeholders as posted: the real shapes and initial values are not given.
    weights = tf.Variable(tf.float64)
    biases = tf.Variable(tf.float64)
    loss = tf.nn.sampled_softmax_loss(
        weights=weights,
        biases=biases,
        labels=labels,
        inputs=inputs,
        ...,
        partition_strategy="div")
    logits = tf.matmul(inputs, tf.transpose(weights))
    logits = tf.nn.bias_add(logits, biases)
    y_predis = tf.nn.softmax_cross_entropy_with_logits_v2(
        labels=inputs[1],
        logits=logits)

3) I guess I need some pointers on which parts of the Model class in the functional API I should modify, given that I have to write a custom loss function like the one above. I guess the issue is accessing the weights inside tf.nn.sampled_softmax_loss.

Recommended answer

The simplest approach I can come up with is to define a loss that ignores the result of the output layer.

Full Colab here: https://colab.research.google.com/drive/1Rp3EUWnBE1eCcaisUju9TwSTswQfZOkS

The loss function. Note that it assumes the output layer is a Dense(activation='softmax') and that it ignores y_pred. Thus, during training and evaluation, where the loss is used, the actual output of the Dense layer is effectively a no-op.

The output layer is used when doing predictions.

import tensorflow as tf  # written against the TF 1.x API (partition_strategy)

class SampledSoftmaxLoss(object):
  """ The loss function implements the Dense layer matmul and activation
  when in training mode.
  """
  def __init__(self, model):
    self.model = model
    # The last layer is assumed to be Dense(activation='softmax').
    output_layer = model.layers[-1]
    self.input = output_layer.input      # symbolic input to the output layer
    self.weights = output_layer.weights  # [kernel, bias]

  def loss(self, y_true, y_pred, **kwargs):
    # y_pred is deliberately ignored; y_true is expected to be one-hot.
    labels = tf.argmax(y_true, axis=1)
    labels = tf.expand_dims(labels, -1)  # shape (batch, 1)
    loss = tf.nn.sampled_softmax_loss(
        weights=self.weights[0],
        biases=self.weights[1],
        labels=labels,
        inputs=self.input,
        num_sampled=3,
        num_classes=4,
        partition_strategy="div",
    )
    return loss
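
To make concrete what a sampled softmax loss computes, here is a minimal pure-Python sketch. It is not TensorFlow's implementation: it assumes uniform negative sampling and leaves out the sampling-probability correction that tf.nn.sampled_softmax_loss applies, and the helper names (dot, sampled_softmax_loss, full_softmax_loss) are hypothetical.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sampled_softmax_loss(weights, biases, label, x, num_sampled, rng):
    """Cross-entropy over the true class plus `num_sampled` sampled negatives.

    weights has one row per class (num_classes x dim); biases has one entry
    per class. Assumption: negatives are drawn uniformly, without
    replacement, from the other classes, with no log-probability correction.
    """
    num_classes = len(weights)
    others = [c for c in range(num_classes) if c != label]
    candidates = [label] + rng.sample(others, num_sampled)  # true class first
    logits = [dot(weights[c], x) + biases[c] for c in candidates]
    # Numerically stable log-sum-exp over the candidate logits only.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[0]


def full_softmax_loss(weights, biases, label, x):
    """Reference value: cross-entropy over all classes."""
    return sampled_softmax_loss(weights, biases, label, x,
                                num_sampled=len(weights) - 1,
                                rng=random.Random(0))


rng = random.Random(42)
num_classes, dim = 6, 3
weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_classes)]
biases = [0.0] * num_classes
x = [0.5, -1.0, 2.0]

sampled = sampled_softmax_loss(weights, biases, label=2, x=x,
                               num_sampled=3, rng=rng)
full = full_softmax_loss(weights, biases, label=2, x=x)
print(sampled, full)
```

In this simplified form the normalizer covers only a subset of the classes, so the sampled value never exceeds the full cross-entropy. It is a cheap training signal only; predictions still go through the full output layer, as the answer describes.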

The model:

import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Model

def make_model():
  inp = Input(shape=(10,))
  h1 = Dense(16, activation='relu')(inp)
  h2 = Dense(4, activation='linear')(h1)
  # output layer and last hidden layer must have the same dims
  out = Dense(4, activation='softmax')(h2)
  model = Model(inp, out)
  # The loss reads the output layer's input and weights from the model.
  loss_calculator = SampledSoftmaxLoss(model)
  model.compile('adam', loss_calculator.loss)
  return model

tf.set_random_seed(42)  # TF 1.x API
model = make_model()
model.summary()

Note that SampledSoftmaxLoss requires the input to the last model layer to have the same dimensionality as the number of classes.
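
This constraint appears to come from a difference in shape conventions: a Keras Dense kernel has shape (input_dim, units), while tf.nn.sampled_softmax_loss expects weights of shape (num_classes, dim). The sketch below uses plain Python lists to illustrate the two conventions; the helper names (shape, transpose, dense_logits, class_logits) are hypothetical and exist only for this illustration.

```python
def shape(m):
    """(rows, cols) of a nested-list matrix."""
    return (len(m), len(m[0]))


def transpose(m):
    """Swap the rows and columns of a nested-list matrix."""
    return [list(col) for col in zip(*m)]


def dense_logits(x, kernel, bias):
    """Keras Dense convention: kernel is (input_dim, units)."""
    units = len(kernel[0])
    return [sum(x[i] * kernel[i][j] for i in range(len(x))) + bias[j]
            for j in range(units)]


def class_logits(x, weights, bias):
    """Sampled-softmax convention: weights is (num_classes, dim)."""
    return [sum(xi * wi for xi, wi in zip(x, row)) + b
            for row, b in zip(weights, bias)]


# Non-square case: 3 input features, 5 classes.
kernel = [[1.0, 2.0, 3.0, 4.0, 5.0],
          [0.5, 0.0, -1.0, 2.0, 1.0],
          [-2.0, 1.0, 0.0, 0.5, 3.0]]
bias = [0.0] * 5
x = [1.0, 2.0, 3.0]

print(shape(kernel))             # (3, 5)
print(shape(transpose(kernel)))  # (5, 3)
# The transposed kernel reproduces the Dense logits in the per-class layout.
print(dense_logits(x, kernel, bias) == class_logits(x, transpose(kernel), bias))

# Square case: the shapes line up without a transpose, but the logits differ
# unless the kernel happens to be symmetric.
square = [[1.0, 2.0],
          [3.0, 4.0]]
x2, bias2 = [1.0, 0.0], [0.0, 0.0]
print(dense_logits(x2, square, bias2))  # [1.0, 2.0]
print(class_logits(x2, square, bias2))  # [1.0, 3.0]
```

If this reading is right, passing a transposed kernel to the loss (for example with tf.transpose) would remove the equal-dimension restriction and keep the training logits consistent with the Dense layer used at prediction time. That change has not been verified against the linked Colab.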
