Logging training and validation loss in tensorboard
Question
I'm trying to learn how to use tensorflow and tensorboard. I have a test project based on the MNIST neural net tutorial.
In my code, I construct a node that calculates the fraction of digits in a data set that are correctly classified, like this:
correct = tf.nn.in_top_k(self._logits, labels, 1)
correct = tf.to_float(correct)
accuracy = tf.reduce_mean(correct)
Here, self._logits is the inference part of the graph, and labels is a placeholder that contains the correct labels.
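To make the computation concrete, here is a minimal pure-Python sketch of what those three lines work out, assuming k = 1 and ignoring how ties between equal scores are resolved. The helper names in_top_1 and accuracy and the sample data are made up for illustration and are not TensorFlow functions.

```python
def in_top_1(logits, labels):
    """For each row, report whether the highest-scoring class is the label."""
    results = []
    for row, label in zip(logits, labels):
        # Index of the largest score in this row (the predicted class).
        predicted = max(range(len(row)), key=lambda i: row[i])
        results.append(predicted == label)
    return results


def accuracy(logits, labels):
    """Fraction of rows classified correctly (mean of the 0/1 indicators)."""
    correct = [float(hit) for hit in in_top_1(logits, labels)]
    return sum(correct) / len(correct)


# Hypothetical data: three examples, three classes; the last row is misclassified.
logits = [[0.1, 2.0, -1.0], [3.0, 0.2, 0.5], [0.0, 0.4, 0.9]]
labels = [1, 0, 1]
print(accuracy(logits, labels))
```

The result depends only on which inputs are supplied, which is why the same node can report either training or validation accuracy.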
Now, what I would like to do is evaluate the accuracy for both the training set and the validation set as training proceeds. I can do this by running the accuracy node twice, with different feed_dicts:
train_acc = sess.run(accuracy, feed_dict={images: training_set.images, labels: training_set.labels})
valid_acc = sess.run(accuracy, feed_dict={images: validation_set.images, labels: validation_set.labels})
This works as intended. I can print the values, and I can see that initially, the two accuracies will both increase, and eventually the validation accuracy will flatten out while the training accuracy keeps increasing.
However, I would also like to get graphs of these values in tensorboard, and I cannot figure out how to do this. If I simply add a scalar_summary to accuracy, the logged values will not distinguish between the training set and the validation set.
I also tried creating two identical accuracy nodes with different names, running one on the training set and one on the validation set, and adding a scalar_summary to each. This does give me two graphs in tensorboard, but instead of one showing the training set accuracy and the other the validation set accuracy, both show identical values that match neither of the values printed to the terminal.
I am probably misunderstanding how to solve this problem. What is the recommended way of separately logging the output from a single node for different inputs?
Recommended answer
There are several different ways you could achieve this, but you're on the right track with creating different tf.summary.scalar() nodes. Since you must explicitly call add_summary() on the summary writer each time you want to log a quantity to the event file, the simplest approach is probably to fetch the appropriate summary node each time you want to get the training or validation accuracy:
# `correct`, `sess`, `images`, `labels`, NUM_STEPS and LOG_PERIOD come from the surrounding code.
accuracy = tf.reduce_mean(correct)

# Two summary nodes with different tags, both reading the same accuracy node.
training_summary = tf.summary.scalar("training_accuracy", accuracy)
validation_summary = tf.summary.scalar("validation_accuracy", accuracy)

summary_writer = tf.summary.FileWriter(...)

for step in range(NUM_STEPS):

    # Perform a training step....

    if step % LOG_PERIOD == 0:

        # To log training accuracy.
        train_acc, train_summ = sess.run(
            [accuracy, training_summary],
            feed_dict={images: training_set.images, labels: training_set.labels})
        summary_writer.add_summary(train_summ, step)

        # To log validation accuracy.
        valid_acc, valid_summ = sess.run(
            [accuracy, validation_summary],
            feed_dict={images: validation_set.images, labels: validation_set.labels})
        summary_writer.add_summary(valid_summ, step)
Alternatively, you could create a single summary op whose tag is a tf.placeholder(tf.string, []) and feed the string "training_accuracy" or "validation_accuracy" as appropriate.
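The idea behind this alternative, one logging routine with the tag supplied at call time, can be sketched without TensorFlow. In the sketch below, EventLog and log_accuracy are hypothetical names made up for illustration. They stand in for the summary writer and the summary op and are not part of any library, and the sample predictions are invented.

```python
from collections import defaultdict


class EventLog:
    """Hypothetical in-memory stand-in for a summary writer (not a TensorFlow class)."""

    def __init__(self):
        self.series = defaultdict(list)

    def add_scalar(self, tag, value, step):
        # One list of (step, value) points per tag, so each tag is its own curve.
        self.series[tag].append((step, value))


def accuracy(predictions, labels):
    """Fraction of predictions that equal their label."""
    hits = sum(1 for p, y in zip(predictions, labels) if p == y)
    return hits / len(labels)


def log_accuracy(log, tag, predictions, labels, step):
    """Single logging routine; the caller decides which tag the value goes under."""
    value = accuracy(predictions, labels)
    log.add_scalar(tag, value, step)
    return value


log = EventLog()
# Same routine, different inputs and a different tag for each data set.
train_acc = log_accuracy(log, "training_accuracy", [1, 0, 2, 2], [1, 0, 2, 1], step=0)
valid_acc = log_accuracy(log, "validation_accuracy", [1, 1], [1, 0], step=0)
```

Because the tag travels with each call, the training and validation values can never end up in the same series, which is the property the question was missing.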