TensorBoard doesn't show all data points


Question

I was running a very long training job (reinforcement learning, 20M steps) and writing a summary every 10k steps. Between steps 4M and 6M, I saw two peaks in the game-score scalar chart in TensorBoard, then I let it run and went to sleep. In the morning, training was at about step 12M, but the peaks I had seen between steps 4M and 6M had disappeared from the chart. I zoomed in and found that TensorBoard had (randomly?) skipped some of the data points. I also tried exporting the data, but some data points, including the peaks, are missing from the exported .csv as well.

I looked for answers and found this on the TensorFlow GitHub page:

TensorBoard uses reservoir sampling to downsample your data so that it can be loaded into RAM. You can modify the number of elements it will keep per tag in tensorboard/backend/server.py.
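To illustrate why points that were visible earlier can vanish later, here is a minimal sketch of textbook reservoir sampling (Algorithm R) in Python. It is not TensorBoard's actual implementation. The function name `reservoir_sample`, the capacity of 1000, and the fixed seed are assumptions chosen for illustration only.

```python
import random


def reservoir_sample(stream, capacity, rng):
    """Keep a uniform random sample of at most `capacity` items from `stream`."""
    reservoir = []
    for seen, item in enumerate(stream, start=1):
        if len(reservoir) < capacity:
            # The reservoir is not full yet, so every item is kept.
            reservoir.append(item)
        else:
            # Pick a slot in [0, seen); the new item replaces an old one
            # only if the slot falls inside the reservoir.
            slot = rng.randrange(seen)
            if slot < capacity:
                reservoir[slot] = item
    return sorted(reservoir)


# 20M steps logged every 10k steps gives 2000 scalar points for one tag.
steps = range(10_000, 20_000_001, 10_000)

# Assumed capacity of 1000 points per tag (illustrative, not read from TensorBoard).
kept = reservoir_sample(steps, capacity=1000, rng=random.Random(0))

dropped = len(steps) - len(kept)
print(f"kept {len(kept)} of {len(steps)} points, dropped {dropped}")
```

While fewer points than the capacity have been written, nothing is dropped, which matches seeing the peaks early in the run. Once the run outgrows the capacity, each new point may evict an older one at random, so a peak that was on the chart at step 6M can be gone by step 12M.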

Has anyone modified this server.py file? Where can I find it, and if I installed TensorFlow from source, do I have to recompile after modifying the file?

Recommended answer

You don't have to change the source code for this; there is a flag called --samples_per_plugin.

Quoting from the help output:

--samples_per_plugin: An optional comma separated list of plugin_name=num_samples pairs to explicitly specify how many samples to keep per tag for that plugin. For unspecified plugins, TensorBoard randomly downsamples logged summaries to reasonable values to prevent out-of-memory errors for long running jobs. This flag allows fine control over that downsampling. Note that 0 means keep all samples of that type. For instance, "scalars=500,images=0" keeps 500 scalars and all images. Most users should not need to set this flag. (default: '')

So if you want a slider of 100 images, use:

tensorboard --logdir <logdir> --samples_per_plugin images=100
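For the scalar charts in the question, the same flag applies with the scalars plugin, and the help text above says 0 keeps every sample. The sketch below only assembles such a command line in Python and prints it. The helper names `samples_per_plugin_arg` and `tensorboard_argv` and the log directory `runs/rl_experiment` are hypothetical, introduced here for illustration.

```python
def samples_per_plugin_arg(limits):
    """Format a {plugin_name: num_samples} dict as the flag's comma separated value."""
    return ",".join(f"{plugin}={count}" for plugin, count in limits.items())


def tensorboard_argv(logdir, limits):
    """Build the argument list for launching TensorBoard with explicit sample limits."""
    return [
        "tensorboard",
        "--logdir", logdir,
        "--samples_per_plugin", samples_per_plugin_arg(limits),
    ]


# Hypothetical log directory; 0 means "keep all samples of that type".
argv = tensorboard_argv("runs/rl_experiment", {"scalars": 0, "images": 100})
print(" ".join(argv))
```

Keeping every scalar trades memory for completeness, which is the out-of-memory risk the help text mentions for long-running jobs, so a large finite limit may be a safer choice than 0 on very long runs.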
