ctc_loss错误“找不到有效路径." [英] ctc_loss error "No valid path found."
问题描述
每次运行火车操作时,用tf.nn.ctc_loss
训练模型都会产生错误:
Training a model with tf.nn.ctc_loss
produces an error every time the train op is run:
tensorflow/core/util/ctc/ctc_loss_calculator.cc:144] No valid path found.
与先前有关此功能的问题不同,这不是由于差异.我的学习率很低,甚至在第一次火车操作中都会发生错误.
Unlike in previous questions about this function, this is not due to divergence. I have a low learning rate, and the error occurs on even the first train op.
该模型是CNN-> LSTM-> CTC.这是模型创建代码:
The model is a CNN -> LSTM -> CTC. Here is the model creation code:
# Build Graph
self.videoInput = tf.placeholder(shape=(None, self.maxVidLen, 50, 100, 3), dtype=tf.float32)
self.videoLengths = tf.placeholder(shape=(None), dtype=tf.int32)
self.keep_prob = tf.placeholder(dtype=tf.float32)
self.targets = tf.sparse_placeholder(tf.int32)
self.targetLengths = tf.placeholder(shape=(None), dtype=tf.int32)
conv1 = tf.layers.conv3d(self.videoInput ...)
pool1 = tf.layers.max_pooling3d(conv1 ...)
conv2 = ...
pool2 = ...
conv3 = ...
pool3 = ...
cnn_out = tf.reshape(pool3, shape=(-1, self.maxVidLength, 4*7*96))
fw_cell = tf.nn.rnn_cell.MultiRNNCell(self.cell(), for _ in range(3))
bw_cell = tf.nn.rnn_cell.MultiRNNCell(self.cell(), for _ in range(3))
outputs, _ = tf.nn.bidirectional_dynamic_rnn(
fw_cell, bw_cell, cnn_out, sequence_length=self.videoLengths, dtype=tf.float32)
outputs = tf.concat(outputs, 2)
outputs = tf.reshape(outputs, [-1, self.hidden_size * 2])
w = tf.Variable(tf.random_normal((self.hidden_size * 2, len(self.char2index) + 1), stddev=0.2))
b = tf.Variable(tf.zeros(len(self.char2index) + 1))
out = tf.matmul(outputs, w) + b
out = tf.reshape(out, [-1, self.maxVidLen, len(self.char2index) + 1])
out = tf.transpose(out, [1, 0, 2])
cost = tf.reduce_mean(tf.nn.ctc_loss(self.targets, out, self.targetLengths))
self.train_op = tf.train.AdamOptimizer(0.0001).minimize(cost)
这是feed dict的创建代码:
And here is the feed dict creation code:
indices = []
values = []
shape = [len(vids) * 2, self.maxLabelLen]
vidInput = np.zeros((len(vids) * 2, self.maxVidLen, 50, 100, 3), dtype=np.float32)
# Actual video, then left-right flip
for j in range(len(vids) * 2):
# K is video index
k = j if j < len(vids) else j - len(vids)
# convert video and label to input format
vidInput[j, 0:len(vids[k])] = vids[k] if k == j else vids[k][:,::-1,:]
indices.extend([j, i] for i in range(len(labelList[k])))
values.extend(self.char2index[c] for c in labelList[k])
fd[self.targets] = (indices, values, shape)
fd[self.videoInput] = vidInput
# Collect video lengths and label lengths
vidLengths = [len(j) for j in vids] + [len(j) for j in vids]
labelLens = [len(l) for l in labelList] + [len(l) for l in labelList]
fd[self.videoLengths] = vidLengths
fd[self.targetLengths] = labelLens
推荐答案
事实证明,ctc_loss要求标签长度短于输入长度.如果标签长度太长,损耗计算器将无法完全展开,因此无法计算损耗.
It turns out that the ctc_loss requires that the label lengths be shorter than the input lengths. If the label lengths are too long, the loss calculator cannot unroll completely and therefore cannot compute the loss.
例如,由于在重复的符号之间插入了空格,标签BIFI
的输入长度至少为4,而标签BIIF
的输入长度至少为5.
For example, the label BIFI
would require input length of at least 4 while the label BIIF
would require input length of at least 5 due to a blank being inserted between the repeated symbols.
这篇关于ctc_loss错误“找不到有效路径."的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!