TensorFlow: Performing this loss computation


Problem description

My question and problem are stated below the two blocks of code.

import numpy as np

def loss(labels, logits, sequence_lengths, label_lengths, logit_lengths):
    scores = []
    for i in xrange(runner.batch_size):
        sequence_length = sequence_lengths[i]
        for j in xrange(sequence_length):
            label_length = label_lengths[i, j]
            logit_length = logit_lengths[i, j]

            # get top k indices <==> argmax_k(labels[i, j, 0, :], label_length)
            top_labels = np.argpartition(labels[i, j, 0, :], -label_length)[-label_length:]
            top_logits = np.argpartition(logits[i, j, 0, :], -logit_length)[-logit_length:]

            scores.append(edit_distance(top_labels, top_logits))

    return np.mean(scores)

# Levenshtein distance between two index sequences
def edit_distance(s, t):
    n = s.size
    m = t.size
    d = np.zeros((n+1, m+1))
    d[:, 0] = np.arange(n+1)  # distance from s[:i] to the empty sequence
    d[0, :] = np.arange(m+1)  # distance from the empty sequence to t[:j]

    for j in xrange(1, m+1):
        for i in xrange(1, n+1):
            if s[i-1] == t[j-1]:
                d[i, j] = d[i-1, j-1]
            else:
                d[i, j] = min(d[i-1, j] + 1,    # deletion
                              d[i, j-1] + 1,    # insertion
                              d[i-1, j-1] + 1)  # substitution

    return d[n, m]
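
As a quick sanity check, the function should give, for example:

# one deletion separates [1, 2, 3] from [1, 3]
print edit_distance(np.array([1, 2, 3]), np.array([1, 3]))      # 1.0
print edit_distance(np.array([1, 2, 3]), np.array([1, 2, 3]))   # 0.0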


Being used in

I've tried to flatten my code so that everything is happening in one place. Let me know if there are typos/points of confusion.

sequence_lengths_placeholder = tf.placeholder(tf.int64, shape=(batch_size))
labels_placeholder = tf.placeholder(tf.float32, shape=(batch_size, max_feature_length, label_size))
label_lengths_placeholder = tf.placeholder(tf.int64, shape=(batch_size, max_feature_length))
loss_placeholder = tf.placeholder(tf.float32, shape=(1))

logit_W = tf.Variable(tf.zeros([lstm_units, label_size]))
logit_b = tf.Variable(tf.zeros([label_size]))

length_W = tf.Variable(tf.zeros([lstm_units, max_length]))
length_b = tf.Variable(tf.zeros([max_length]))

lstm = rnn_cell.BasicLSTMCell(lstm_units)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * layer_count)

rnn_out, state = rnn.rnn(stacked_lstm, features, dtype=tf.float32, sequence_length=sequence_lengths_placeholder)

logits = tf.concat(1, [tf.reshape(tf.matmul(t, logit_W) + logit_b, [batch_size, 1, 2, label_size]) for t in rnn_out])

logit_lengths = tf.concat(1, [tf.reshape(tf.matmul(t, length_W) + length_b, [batch_size, 1, max_length]) for t in rnn_out])

optimizer = tf.train.AdamOptimizer(learning_rate)
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss_placeholder, global_step=global_step)

...
...
# Inside training loop

np_labels, np_logits, sequence_lengths, label_lengths, logit_lengths = sess.run(
    [labels_placeholder, logits, sequence_lengths_placeholder, label_lengths_placeholder, logit_lengths],
    feed_dict=feed_dict)
# keep the numpy loss under a distinct name so it does not shadow loss()
loss_value = loss(np_labels, np_logits, sequence_lengths, label_lengths, logit_lengths)
_ = sess.run([train_op], feed_dict={loss_placeholder: loss_value})


My issue

The issue is that this is returning the error:

  File "runner.py", line 63, in <module>
    train_op = optimizer.minimize(loss_placeholder, global_step=global_step)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 188, in minimize
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 277, in apply_gradients
    (grads_and_vars,))

  ValueError: No gradients provided for any variable: <all my variables>

So I assume that this is TensorFlow complaining that it can't compute the gradients of my loss, because the loss is computed by numpy, outside the scope of TF.
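
For illustration, a minimal sketch of what minimize() needs (the dummy loss below is only a stand-in for illustration, not a suggestion for this model): gradients can only be computed for a Tensor that depends on the graph's variables, and a fed placeholder depends on nothing.

# loss_placeholder is fed from outside the graph; it has no dependency on
# logit_W, logit_b, length_W or length_b, so there is nothing to differentiate:
train_op = optimizer.minimize(loss_placeholder)    # ValueError: No gradients provided

# any Tensor actually computed from the variables would have gradients, e.g.
# this (meaningless) stand-in loss:
dummy_loss = tf.reduce_mean(tf.square(logits))     # depends on logit_W/logit_b via logits
train_op = optimizer.minimize(dummy_loss)          # gradients exist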

So naturally, to fix that, I would try to implement this in TensorFlow. The issue is that my logit_lengths and label_lengths are both Tensors, so when I try to access a single element, I get back a Tensor of shape []. This is an issue when I'm trying to use tf.nn.top_k(), which takes an int for its k parameter.
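
To make that concrete, a hypothetical snippet (the indexing here is illustrative, not from the code above):

# indexing a placeholder yields a 0-D Tensor, not a Python int
k = label_lengths_placeholder[0, 0]     # Tensor with shape [], dtype int64
# tf.nn.top_k(..., k=k) then fails, since k has to be a plain Python int here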

Another issue with that is that my label_lengths is a placeholder, and since my loss value needs to be defined before the optimizer.minimize(loss) call, I also get an error saying that a value needs to be passed for the placeholder.

I'm just wondering how I could implement this loss function, or whether I'm missing something obvious.

Edit: After some further reading, I see that losses like the one I describe are usually used in validation, while training minimizes a surrogate loss that has its minimum in the same place as the true loss. Does anyone know what surrogate loss is used for an edit-distance-based scenario like mine?

Answer

The first thing I would do is calculate the loss using TensorFlow instead of numpy. That will allow TensorFlow to compute the gradients for you, so you will be able to back-propagate, meaning you can minimize the loss.

There is a tf.edit_distance (https://www.tensorflow.org/api_docs/python/tf/edit_distance) function in the core library.
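
A minimal sketch of how it can be called (the shapes and label ids below are made up for illustration): both arguments are tf.SparseTensors of per-batch, per-time label ids.

hypothesis = tf.SparseTensor(
    indices=[[0, 0], [0, 1]],            # (batch, time) positions
    values=[1, 2],                       # predicted label ids
    dense_shape=[1, 2])
truth = tf.SparseTensor(
    indices=[[0, 0], [0, 1], [0, 2]],
    values=[1, 2, 3],                    # ground-truth label ids
    dense_shape=[1, 3])

# normalize=True divides each distance by the truth length, giving a
# label error rate; here one insertion / three labels = 1/3
distance = tf.edit_distance(hypothesis, truth, normalize=True)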

So naturally, to fix that, I would try to implement this in TensorFlow. The issue is that my logit_lengths and label_lengths are both Tensors, so when I try to access a single element, I get back a Tensor of shape []. This is an issue when I'm trying to use tf.nn.top_k(), which takes an int for its k parameter.

Could you provide a bit more detail about why this is an issue?
