How to display Runtime Statistics in Tensorboard using Estimator API in a distributed environment


Problem description


This article illustrates how to add runtime statistics to TensorBoard:

    # Assumes an active tf.Session `sess`, a merged summary op `merged`,
    # a train op `train_step`, a tf.summary.FileWriter `train_writer`,
    # and the current step counter `i`.
    run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
    run_metadata = tf.RunMetadata()
    summary, _ = sess.run([merged, train_step],
                          feed_dict=feed_dict(True),
                          options=run_options,
                          run_metadata=run_metadata)
    train_writer.add_run_metadata(run_metadata, 'step%d' % i)
    train_writer.add_summary(summary, i)
    print('Adding run metadata for', i)

which surfaces compute time and memory details for each node in TensorBoard.

This is fairly straightforward on a single machine. How could one do this in a distributed environment using Estimators?

Solution

You may use tf.train.ProfilerHook. The catch, however, is that it was released in TensorFlow 1.14.

Example usage:

    estimator = tf.estimator.LinearClassifier(...)
    hooks = [tf.train.ProfilerHook(output_dir=model_dir,
                                   save_secs=600,
                                   show_memory=False)]
    estimator.train(input_fn=train_input_fn, hooks=hooks)

Executing the hook will generate files named timeline-xx.json in output_dir.
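The timeline files are plain Chrome-trace-format JSON, so besides viewing them in chrome://tracing/ you can aggregate them offline. A minimal sketch, assuming the standard trace-event layout; the `summarize_trace` helper and the synthetic `sample` dict are illustrative, not part of the hook's API:

```python
import json
from collections import defaultdict

def summarize_trace(trace):
    """Sum the duration (microseconds) of each op name in a Chrome-trace dict."""
    totals = defaultdict(int)
    for event in trace.get("traceEvents", []):
        # Complete events ("X") carry a duration; metadata events ("M") do not.
        if event.get("ph") == "X" and "dur" in event:
            totals[event["name"]] += event["dur"]
    return dict(totals)

# A tiny synthetic stand-in for the contents of a timeline file;
# in practice you would load one produced by the hook with json.load().
sample = json.loads("""
{"traceEvents": [
  {"name": "MatMul", "ph": "X", "ts": 0,   "dur": 120, "pid": 1, "tid": 1},
  {"name": "MatMul", "ph": "X", "ts": 200, "dur": 80,  "pid": 1, "tid": 1},
  {"name": "Add",    "ph": "X", "ts": 300, "dur": 30,  "pid": 1, "tid": 1},
  {"name": "process_name", "ph": "M", "pid": 1}
]}
""")

print(summarize_trace(sample))  # → {'MatMul': 200, 'Add': 30}
```

This gives a quick per-op cost ranking without opening the tracing UI.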

Then open chrome://tracing/ in the Chrome browser and load one of the files. You will get a per-op time-usage timeline.

