Logging training and validation loss in tensorboard

Problem description

I'm trying to learn how to use tensorflow and tensorboard. I have a test project based on the MNIST neural net tutorial.

In my code, I construct a node that calculates the fraction of digits in a data set that are correctly classified, like this:

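# Assumes TensorFlow 1.x graph mode, with `import tensorflow as tf`.
correct = tf.nn.in_top_k(self._logits, labels, 1)  # True where the label is the top prediction
correct = tf.cast(correct, tf.float32)
accuracy = tf.reduce_mean(correct)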

Here, self._logits is the inference part of the graph, and labels is a placeholder that contains the correct labels.
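
To make the quantity concrete, here is a minimal plain-Python sketch of what such an accuracy node computes. The function name top1_accuracy and the example numbers are made up for illustration, and counting ties as correct is an assumption about how in_top_k behaves with k=1, not something stated above.

```python
def top1_accuracy(logits, labels):
    """Fraction of rows whose labeled class has the highest score."""
    hits = 0
    for row, label in zip(logits, labels):
        # Count a hit when no class scores strictly higher than the labeled class
        # (assumption: ties count as correct).
        if all(score <= row[label] for score in row):
            hits += 1
    return hits / len(labels)


# Hypothetical scores for four examples over three classes.
example_logits = [
    [0.1, 2.0, -1.0],  # highest score: class 1
    [1.5, 0.2, 0.3],   # highest score: class 0
    [0.0, 0.1, 3.0],   # highest score: class 2
    [0.9, 0.8, 0.1],   # highest score: class 0
]
example_labels = [1, 0, 2, 1]

print(top1_accuracy(example_logits, example_labels))  # 0.75
```

Three of the four rows have their labeled class on top, so the result is 0.75. Feeding a different set of rows gives a different value from the same function, which is the situation the rest of the question is about.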

Now, what I would like to do is evaluate the accuracy for both the training set and the validation set as training proceeds. I can do this by running the accuracy node twice, with different feed_dicts:

# sess is the active tf.Session.
train_acc = sess.run(accuracy, feed_dict={images: training_set.images, labels: training_set.labels})
valid_acc = sess.run(accuracy, feed_dict={images: validation_set.images, labels: validation_set.labels})

This works as intended. I can print the values, and I can see that initially, the two accuracies will both increase, and eventually the validation accuracy will flatten out while the training accuracy keeps increasing.

However, I would also like to get graphs of these values in tensorboard, and I cannot figure out how to do this. If I simply add a scalar_summary to accuracy, the logged values will not distinguish between the training set and the validation set.

I also tried creating two identical accuracy nodes with different names and running one on the training set and one on the validation set. I then add a scalar_summary to each of these nodes. This does give me two graphs in tensorboard, but instead of one graph showing the training set accuracy and one showing the validation set accuracy, they are both showing identical values that do not match either of the ones printed to the terminal.

I am probably misunderstanding how to solve this problem. What is the recommended way of separately logging the output from a single node for different inputs?

Recommended answer

There are several different ways you could achieve this, but you're on the right track with creating different tf.summary.scalar() nodes. Since you must explicitly call tf.summary.FileWriter.add_summary() each time you want to log a quantity to the event file, the simplest approach is probably to fetch the appropriate summary node each time you want to get the training or validation accuracy:

accuracy = tf.reduce_mean(correct)

# Two summary ops over the same accuracy node, one tag per data set.
training_summary = tf.summary.scalar("training_accuracy", accuracy)
validation_summary = tf.summary.scalar("validation_accuracy", accuracy)

summary_writer = tf.summary.FileWriter(...)

for step in range(NUM_STEPS):

  # Perform a training step....

  if step % LOG_PERIOD == 0:

    # To log training accuracy.
    train_acc, train_summ = sess.run(
        [accuracy, training_summary],
        feed_dict={images: training_set.images, labels: training_set.labels})
    summary_writer.add_summary(train_summ, step)

    # To log validation accuracy.
    valid_acc, valid_summ = sess.run(
        [accuracy, validation_summary],
        feed_dict={images: validation_set.images, labels: validation_set.labels})
    summary_writer.add_summary(valid_summ, step)
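
The pattern in the loop above (one metric, evaluated on two inputs, each result recorded under its own tag at the same step) can be sketched without TensorFlow. Everything in this sketch is hypothetical: InMemoryWriter stands in for the event-file writer, fake_predictions stands in for the model, and the labels are made up.

```python
class InMemoryWriter:
    """Hypothetical stand-in for an event-file writer: keeps (step, value) pairs per tag."""

    def __init__(self):
        self.records = {}

    def add_scalar(self, tag, value, step):
        self.records.setdefault(tag, []).append((step, value))


def accuracy(predictions, labels):
    """Fraction of predictions that equal their labels."""
    matches = sum(1 for p, y in zip(predictions, labels) if p == y)
    return matches / len(labels)


def fake_predictions(labels, num_correct):
    # Hypothetical model: the first `num_correct` predictions are right, the rest wrong.
    return [y if i < num_correct else 1 - y for i, y in enumerate(labels)]


# Made-up binary labels for the two data sets.
train_labels = [0, 1, 1, 0]
valid_labels = [1, 0, 1, 1]

writer = InMemoryWriter()
for step in range(0, 5, 2):
    # Same metric function, different input, different tag.
    train_acc = accuracy(fake_predictions(train_labels, step), train_labels)
    valid_acc = accuracy(fake_predictions(valid_labels, step // 2), valid_labels)
    writer.add_scalar("training_accuracy", train_acc, step)
    writer.add_scalar("validation_accuracy", valid_acc, step)

for tag, points in writer.records.items():
    print(tag, points)
```

Because each value is written under a tag chosen at the call site, the two series stay separate even though a single accuracy function produces both.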

Alternatively, you could create a single summary op whose tag is a tf.placeholder(tf.string, []) and feed the string "training_accuracy" or "validation_accuracy" as appropriate.
