如何使用 Tensorboard 在同一图上绘制不同的汇总指标? [英] How to plot different summary metrics on the same plot with Tensorboard?

查看:56
本文介绍了如何使用 Tensorboard 在同一图上绘制不同的汇总指标?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我希望能够绘制每批次的训练损失平均验证损失在同一图上的验证集张量板.当我的验证集太大而无法放入内存时,我遇到了这个问题,因此需要批处理和使用 tf.metrics 更新操作.

I would like to be able plot the training loss per batch and the average validation loss for the validation set on the same plot in Tensorboard. I ran into this issue when my validation set was too large to fit into memory so required batching and the use of tf.metrics update ops.

此问题适用于您希望在 Tensorboard 中的同一图表上显示的任何 Tensorflow 指标.

This question could apply to any Tensorflow metrics you wanted to appear on the same graph in Tensorboard.

我可以

  • 分别绘制这两个图(参见此处)
  • 在与 training-loss-per-training-batch 相同的图表上绘制 validation-loss-per-validation-batchset 可以是单个批次,我可以重复使用下面的训练摘要操作 train_summ)
  • plot these two graphs separately (see here)
  • plot the validation-loss-per-validation-batch on the same graph as the training-loss-per-training-batch (this was OK when the validation set could be a single batch and I could reuse the training summary op train_summ below)

在下面的示例代码中,我的问题源于这样一个事实,即我的验证摘要 tf.summary.scalarname=loss 被重命名为 loss_1 并因此移动到 Tensorboard 中的单独图形.据我所知,Tensorboard 采用 同名" 并将它们绘制在同一个图形上,而不管它们在哪个文件夹中.这是令人沮丧的 train_summ(名称=loss) 只写入 train 文件夹,valid_summ (name=loss) 只写入 valid 文件夹 - 但仍然是重命名为 loss_1.

In the example code below, my issue stems from the fact that my validation summary tf.summary.scalar with name=loss gets renamed to loss_1 and thus is moved to a separate graph in Tensorboard. From what I can work out Tensorboard takes "same name" and plots them on the same graph, regardless of what folder they are in. This is frustrating as train_summ (name=loss) is only ever written to the train folder and valid_summ (name=loss) is only ever written to the valid folder - but is still renamed to loss_1.

示例代码:

# View graphs with (Linux): $ tensorboard --logdir=/tmp/my_tf_model

import tensorflow as tf
import numpy as np
import os
import tempfile

def train_data_gen():
    yield np.random.normal(size=[3]), np.array([0.5, 0.5, 0.5])

def valid_data_gen():
    yield np.random.normal(size=[3]), np.array([0.8, 0.8, 0.8])

batch_size = 25
n_training_batches = 4
n_valid_batches = 2
n_epochs = 5
summary_loc = os.path.join(tempfile.gettempdir(), 'my_tf_model')
print("Summaries written to" + summary_loc)

# Dummy data
train_data = tf.data.Dataset.from_generator(train_data_gen, (tf.float32, tf.float32)).repeat().batch(batch_size)
valid_data = tf.data.Dataset.from_generator(valid_data_gen, (tf.float32, tf.float32)).repeat().batch(batch_size)
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(handle, 
train_data.output_types, train_data.output_shapes)
batch_x, batch_y = iterator.get_next()
train_iter = train_data.make_initializable_iterator()
valid_iter = valid_data.make_initializable_iterator()

# Some ops on the data
loss = tf.losses.mean_squared_error(batch_x, batch_y)
valid_loss, valid_loss_update = tf.metrics.mean(loss)

# Write to summaries
train_summ = tf.summary.scalar('loss', loss)
valid_summ = tf.summary.scalar('loss', valid_loss)  # <- will be renamed to "loss_1"

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    train_handle, valid_handle = sess.run([train_iter.string_handle(), valid_iter.string_handle()])
    sess.run([train_iter.initializer, valid_iter.initializer])

    # Summary writers
    writer_train = tf.summary.FileWriter(os.path.join(summary_loc, 'train'), sess.graph)
    writer_valid = tf.summary.FileWriter(os.path.join(summary_loc, 'valid'), sess.graph)

    global_step = 0  # implicit as no actual training
    for i in range(n_epochs):
        # "Training"
        for j in range(n_training_batches):
            global_step += 1
            summ = sess.run(train_summ, feed_dict={handle: train_handle})
            writer_train.add_summary(summary=summ, global_step=global_step)
        # "Validation"
        sess.run(tf.local_variables_initializer())
        for j in range(n_valid_batches):
             _, batch_summ = sess.run([valid_loss_update, train_summ], feed_dict={handle: valid_handle})
            # The following will plot the batch loss for the validation set on the loss plot with the training data:
            # writer_valid.add_summary(summary=batch_summ, global_step=global_step + j + 1)
        summ = sess.run(valid_summ)
        writer_valid.add_summary(summary=summ, global_step=global_step)  # <- I want this on the training loss graph

我尝试了什么

  • 按照 这个问题这个问题(想想我在那个问题的评论中提到了什么)
  • 使用tf.summary.merge 将我所有的训练和验证/测试指标合并到整体摘要操作中;做有用的簿记,但没有在同一张图上绘制我想要的
  • 使用 tf.summary.scalar family 属性(loss 仍重命名为 loss_1)
  • (完整的 hack 解决方案)training 数据使用 valid_loss, valid_loss_update = tf.metrics.mean(loss) 然后运行 ​​tf.local_variables_initializer() 每个训练批次.这确实为您提供了相同的摘要操作,从而将事情放在同一个图表上,但肯定不是您有意这样做的方式?它也不能推广到其他指标.
  • What I have tried

    • Separate tf.summary.FileWriter objects (one for training, one for validation), as recommended by this issue and this question (think what I'm after is alluded to in the comment of that question)
    • The use of tf.summary.merge to merge all my training and validation/test metrics into overall summary ops; does useful book-keeping but doesn't plot what I want on the same graph
    • Use of the tf.summary.scalar family attribute (loss still gets renamed to loss_1)
    • (Complete hack solution) Use valid_loss, valid_loss_update = tf.metrics.mean(loss) on the training data and then run tf.local_variables_initializer() every training batch. This does give you the same summary op and thus puts things on the same graph but is surely not how you're meant to do this? It also doesn't generalise to other metrics.
      • TensorFlow 1.9.0
      • 张量板 1.9.0
      • Python 3.5.2

      推荐答案

      Tensorboard custom_scalar plugin 是解决这个问题的方法.

      The Tensorboard custom_scalar plugin is the way to solve this problem.

      这里是同样的例子,使用 custom_scalar 在同一图上绘制两个损失(每个训练批次 + 所有验证批次的平均值):

      Here's the same example again with a custom_scalar to plot the two losses (per training batch + averaged over all validation batches) on the same plot:

      # View graphs with (Linux): $ tensorboard --logdir=/tmp/my_tf_model
      
      import os
      import tempfile
      import tensorflow as tf
      import numpy as np
      from tensorboard import summary as summary_lib
      from tensorboard.plugins.custom_scalar import layout_pb2
      
      def train_data_gen():
          yield np.random.normal(size=[3]), np.array([0.5, 0.5, 0.5])
      
      def valid_data_gen():
          yield np.random.normal(size=[3]), np.array([0.8, 0.8, 0.8])
      
      batch_size = 25
      n_training_batches = 4
      n_valid_batches = 2
      n_epochs = 5
      summary_loc = os.path.join(tempfile.gettempdir(), 'my_tf_model')
      print("Summaries written to " + summary_loc)
      
      # Dummy data
      train_data = tf.data.Dataset.from_generator(
          train_data_gen, (tf.float32, tf.float32)).repeat().batch(batch_size)
      valid_data = tf.data.Dataset.from_generator(
          valid_data_gen, (tf.float32, tf.float32)).repeat().batch(batch_size)
      handle = tf.placeholder(tf.string, shape=[])
      iterator = tf.data.Iterator.from_string_handle(handle, train_data.output_types,
                                                     train_data.output_shapes)
      batch_x, batch_y = iterator.get_next()
      train_iter = train_data.make_initializable_iterator()
      valid_iter = valid_data.make_initializable_iterator()
      
      # Some ops on the data
      loss = tf.losses.mean_squared_error(batch_x, batch_y)
      valid_loss, valid_loss_update = tf.metrics.mean(loss)
      
      with tf.name_scope('loss'):
          train_summ = summary_lib.scalar('training', loss)
          valid_summ = summary_lib.scalar('valid', valid_loss)
      
      with tf.Session() as sess:
          sess.run(tf.global_variables_initializer())
          train_handle, valid_handle = sess.run([train_iter.string_handle(), valid_iter.string_handle()])
          sess.run([train_iter.initializer, valid_iter.initializer])
      
          writer_train = tf.summary.FileWriter(os.path.join(summary_loc, 'train'), sess.graph)
          writer_valid = tf.summary.FileWriter(os.path.join(summary_loc, 'valid'), sess.graph)
      
          layout_summary = summary_lib.custom_scalar_pb(
              layout_pb2.Layout(category=[
                  layout_pb2.Category(
                      title='losses',
                      chart=[
                          layout_pb2.Chart(
                              title='losses',
                              multiline=layout_pb2.MultilineChartContent(tag=[
                                  'loss/training', 'loss/valid'
                              ]))
                      ])
              ]))
          writer_train.add_summary(layout_summary)
      
          global_step = 0
          for i in range(n_epochs):
              for j in range(n_training_batches): # "Training"
                  global_step += 1
                  summ = sess.run(train_summ, feed_dict={handle: train_handle})
                  writer_train.add_summary(summary=summ, global_step=global_step)
      
              sess.run(tf.local_variables_initializer())
              for j in range(n_valid_batches):  # "Validation"
                  _, batch_summ = sess.run([valid_loss_update, train_summ], feed_dict={handle: valid_handle})
              summ = sess.run(valid_summ)
              writer_valid.add_summary(summary=summ, global_step=global_step)
      

      这是 Tensorboard 中的结果输出.

      这篇关于如何使用 Tensorboard 在同一图上绘制不同的汇总指标?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆