How to use the Embedding Projector in Tensorflow 2.0


Problem description

With the tf.contrib module gone from Tensorflow, and with tf.train.Saver() also gone, I cannot find a way to store a set of embeddings and their corresponding thumbnails, so that the Tensorboard Projector can read them.

The Tensorboard documentation for Tensorflow 2.0 explains how to create plots and summaries, and how to use the summary tool in general, but nothing about the projector tool. Has anyone found how to store datasets for visualization?

If possible, I would appreciate a (minimal) code example.

Recommended answer

It seems there are some issues left in tensorboard. However, there are some workarounds (for now) for preparing embeddings for the projector with tensorflow2 (bug report at: https://github.com/tensorflow/tensorboard/issues/2471).

The tensorflow1 code would look something like this:

import tensorflow as tf
from tensorboard.plugins import projector

# latent_data, TENSORBOARD_DIR and TENSORBOARD_METADATA_FILE are assumed to be
# defined elsewhere: the embedding matrix, the log directory, and the metadata
# file inside it.
embeddings = tf.compat.v1.Variable(latent_data, name='embeddings')
CHECKPOINT_FILE = TENSORBOARD_DIR + '/model.ckpt'
# Write summaries for tensorboard
with tf.compat.v1.Session() as sess:
    saver = tf.compat.v1.train.Saver([embeddings])
    sess.run(embeddings.initializer)
    saver.save(sess, CHECKPOINT_FILE)
    config = projector.ProjectorConfig()
    embedding = config.embeddings.add()
    embedding.tensor_name = embeddings.name
    embedding.metadata_path = TENSORBOARD_METADATA_FILE

projector.visualize_embeddings(
    tf.compat.v1.summary.FileWriter(TENSORBOARD_DIR), config)
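For reference, TENSORBOARD_METADATA_FILE points at a tab-separated metadata file inside the log directory that the projector uses to label the points. A minimal sketch of writing one, assuming a list called labels with one entry per row of latent_data (labels is an example variable, not from the original post; note that a single-column metadata file must not have a header row):

with open(TENSORBOARD_METADATA_FILE, 'w') as f:
    # One label per line, in the same row order as latent_data.
    for label in labels:
        f.write('{}\n'.format(label))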

When using eager mode in tensorflow2, this should (?) look something like this:

embeddings = tf.Variable(latent_data, name='embeddings')
CHECKPOINT_FILE = TENSORBOARD_DIR + '/model.ckpt'
# TF2-style checkpoint instead of tf.compat.v1.train.Saver
ckpt = tf.train.Checkpoint(embeddings=embeddings)
ckpt.save(CHECKPOINT_FILE)

config = projector.ProjectorConfig()
embedding = config.embeddings.add()
# Note: in eager mode this evaluates to something like 'embeddings:0',
# which does not match the name stored in the TF2 checkpoint (see below).
embedding.tensor_name = embeddings.name
embedding.metadata_path = TENSORBOARD_METADATA_FILE

writer = tf.summary.create_file_writer(TENSORBOARD_DIR)
projector.visualize_embeddings(writer, config)

However, there are two issues:

  • the writer created with tf.summary.create_file_writer does not have the get_logdir() function required by projector.visualize_embeddings; a simple workaround is to patch the visualize_embeddings function to take the logdir as a parameter (a sketch follows this list).
  • the checkpoint format has changed: when reading the checkpoint with load_checkpoint (which seems to be the tensorboard way of loading the file), the variable names change, e.g. embeddings becomes something like embeddings/.ATTRIBUTES/VARIABLE_VALUE (there are also additional variables in the map extracted by get_variable_to_shape_map(), but they are empty anyway).
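For the first issue, a patched visualize_embeddings might look like the sketch below. This is only an assumption about what is needed: the original function essentially serializes the config to projector_config.pbtxt inside the log directory (the file the projector plugin reads), so the patch takes that directory explicitly; the exact internals may differ between tensorboard versions.

import os
from google.protobuf import text_format

def visualize_embeddings(writer, config, logdir):
    # Patched: take the log directory explicitly instead of calling
    # writer.get_logdir(), which TF2 summary writers do not provide.
    # 'writer' is kept only so existing call sites keep working.
    config_path = os.path.join(logdir, 'projector_config.pbtxt')
    with open(config_path, 'w') as f:
        f.write(text_format.MessageToString(config))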

The second issue was solved with the following quick-and-dirty workaround (and logdir is now a parameter of visualize_embeddings()):

embeddings = tf.Variable(latent_data, name='embeddings')
CHECKPOINT_FILE = TENSORBOARD_DIR + '/model.ckpt'
ckpt = tf.train.Checkpoint(embeddings=embeddings)
ckpt.save(CHECKPOINT_FILE)

# Look up the name the variable actually got inside the TF2 checkpoint,
# e.g. 'embeddings/.ATTRIBUTES/VARIABLE_VALUE'.
reader = tf.train.load_checkpoint(TENSORBOARD_DIR)
shape_map = reader.get_variable_to_shape_map()
key_to_use = ""
for key in shape_map:
    if "embeddings" in key:
        key_to_use = key

config = projector.ProjectorConfig()
embedding = config.embeddings.add()
embedding.tensor_name = key_to_use
embedding.metadata_path = TENSORBOARD_METADATA_FILE

writer = tf.summary.create_file_writer(TENSORBOARD_DIR)
# visualize_embeddings here is the patched version that takes the logdir.
projector.visualize_embeddings(writer, config, TENSORBOARD_DIR)
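Since the question also mentions thumbnails: the projector reads those from a single sprite image referenced in the config. A hedged sketch, assuming all thumbnails have been packed into a grid image sprite.png inside TENSORBOARD_DIR, each thumbnail 32x32 pixels (the file name and dimensions are example values, not from the original post); these two lines would go right after setting metadata_path above:

# Relative paths are resolved against the logdir in recent tensorboard
# versions; older versions may need an absolute path.
embedding.sprite.image_path = 'sprite.png'
embedding.sprite.single_image_dim.extend([32, 32])  # width, height of one thumbnail

Running tensorboard --logdir pointed at TENSORBOARD_DIR should then show the embeddings, labels and thumbnails under the Projector tab.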

I did not find any examples of how to use tensorflow2 to write the embeddings for tensorboard directly, so I am not sure whether this is the right way; but if it is, these two issues would need to be addressed, and at least for now there is a workaround.

