ctc_loss error "No valid path found."


Problem description

Training a model with tf.nn.ctc_loss produces an error every time the train op is run:

tensorflow/core/util/ctc/ctc_loss_calculator.cc:144] No valid path found.

Unlike in previous questions about this function, this is not due to divergence. I have a low learning rate, and the error occurs on even the first train op.

The model is a CNN -> LSTM -> CTC. Here is the model creation code:

# Build Graph (TensorFlow 1.x API)
import tensorflow as tf

self.videoInput = tf.placeholder(shape=(None, self.maxVidLen, 50, 100, 3), dtype=tf.float32)
self.videoLengths = tf.placeholder(shape=(None,), dtype=tf.int32)
self.keep_prob = tf.placeholder(dtype=tf.float32)
self.targets = tf.sparse_placeholder(tf.int32)
self.targetLengths = tf.placeholder(shape=(None,), dtype=tf.int32)

conv1 = tf.layers.conv3d(self.videoInput ...)
pool1 = tf.layers.max_pooling3d(conv1 ...)
conv2 = ...
pool2 = ...
conv3 = ...
pool3 = ...

cnn_out = tf.reshape(pool3, shape=(-1, self.maxVidLen, 4*7*96))

# MultiRNNCell takes a list of cells, one per stacked layer
fw_cell = tf.nn.rnn_cell.MultiRNNCell([self.cell() for _ in range(3)])
bw_cell = tf.nn.rnn_cell.MultiRNNCell([self.cell() for _ in range(3)])
outputs, _ = tf.nn.bidirectional_dynamic_rnn(
            fw_cell, bw_cell, cnn_out, sequence_length=self.videoLengths, dtype=tf.float32)

outputs = tf.concat(outputs, 2)
outputs = tf.reshape(outputs, [-1, self.hidden_size * 2])

w = tf.Variable(tf.random_normal((self.hidden_size * 2, len(self.char2index) + 1), stddev=0.2))
b = tf.Variable(tf.zeros(len(self.char2index) + 1))

out = tf.matmul(outputs, w) + b
out = tf.reshape(out, [-1, self.maxVidLen, len(self.char2index) + 1])
out = tf.transpose(out, [1, 0, 2])

cost = tf.reduce_mean(tf.nn.ctc_loss(self.targets, out, self.targetLengths))
self.train_op = tf.train.AdamOptimizer(0.0001).minimize(cost)

And here is the feed dict creation code:

indices = []
values = []
shape = [len(vids) * 2, self.maxLabelLen]
vidInput = np.zeros((len(vids) * 2, self.maxVidLen, 50, 100, 3), dtype=np.float32)

# Actual video, then left-right flip
for j in range(len(vids) * 2):

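    # k is the index of the source video
    k = j if j < len(vids) else j - len(vids)

    # convert video and label to input format
    # (assuming frames are laid out as (time, height, width, channels),
    # a left-right flip reverses axis 2, the width axis)
    vidInput[j, 0:len(vids[k])] = vids[k] if k == j else vids[k][:, :, ::-1]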
    indices.extend([j, i] for i in range(len(labelList[k])))
    values.extend(self.char2index[c] for c in labelList[k])

fd[self.targets] = (indices, values, shape)
fd[self.videoInput] = vidInput

# Collect video lengths and label lengths
vidLengths = [len(j) for j in vids] + [len(j) for j in vids]
labelLens = [len(l) for l in labelList] + [len(l) for l in labelList]
fd[self.videoLengths] = vidLengths
fd[self.targetLengths] = labelLens

Recommended answer

It turns out that ctc_loss requires each label to fit within its input sequence: the label can be no longer than the input, and it must be strictly shorter when it contains repeated adjacent symbols. If a label is too long, the loss calculator cannot unroll it completely across the available time steps and therefore cannot compute the loss.

For example, the label BIFI would require input length of at least 4 while the label BIIF would require input length of at least 5 due to a blank being inserted between the repeated symbols.
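
The rule above can be checked before a batch is fed to the loss. The following is a minimal sketch, not part of the original answer: the helper names min_ctc_input_length and find_infeasible are hypothetical, and it assumes the standard CTC alignment rule that each symbol needs one time step plus one blank between every pair of identical adjacent symbols.

```python
def min_ctc_input_length(label):
    """Smallest number of time steps that admits a valid CTC alignment of `label`.

    `label` may be a string or a list of symbol ids (hypothetical helper).
    """
    # One step per symbol, plus one blank between each pair of identical neighbours.
    repeats = sum(1 for prev, cur in zip(label, label[1:]) if prev == cur)
    return len(label) + repeats


def find_infeasible(labels, input_lengths):
    """Return indices of examples whose label cannot fit in its input length."""
    return [
        i
        for i, (label, n_steps) in enumerate(zip(labels, input_lengths))
        if min_ctc_input_length(label) > n_steps
    ]


print(min_ctc_input_length("BIFI"))               # 4
print(min_ctc_input_length("BIIF"))               # 5
print(find_infeasible(["BIFI", "BIIF"], [4, 4]))  # [1]
```

When running such a check, the lengths to compare against are the ones the loss actually receives. In TensorFlow 1.x the third positional argument of tf.nn.ctc_loss is sequence_length, the per-example input length in time steps, so a value passed there that is as short as the label leaves no room for the blanks that repeated symbols need.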
