How to get the current global_step in a data pipeline
Question
I am trying to create a filter that depends on the current global_step of the training, but I am failing to do so properly.
First, I cannot use tf.train.get_or_create_global_step() in the code below because it throws:
ValueError: Variable global_step already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
This is why I tried fetching the scope with tf.get_default_graph().get_name_scope(); within that context I was able to "get" the global step:
def filter_examples(example):
    scope = tf.get_default_graph().get_name_scope()
    with tf.variable_scope(scope, reuse=tf.AUTO_REUSE):
        current_step = tf.train.get_or_create_global_step()
    subtokens_by_step = tf.floor(current_step / curriculum_step_update)
    max_subtokens = min_subtokens + curriculum_step_size * tf.cast(subtokens_by_step, dtype=tf.int32)
    return tf.size(example['targets']) <= max_subtokens

dataset = dataset.filter(filter_examples)
The problem is that this does not work as I expected. From what I observe, current_step in the code above seems to be 0 all the time (I have not verified this; it is an assumption based on my observations).
The only thing that seems to make a difference, odd as it sounds, is restarting the training. Again based on observation, I think that in that case current_step is the actual step of the training at the moment of the restart, but the value does not update as the training continues.
Is there a way to get the actual value of the current step and use it in my filter as shown above?
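For reference, the threshold the filter is meant to compute can be sketched in plain Python, independent of TensorFlow. This is only an illustration of the arithmetic: the helper names max_subtokens_at and keep_example are hypothetical, and the default values for min_subtokens, curriculum_step_size, and curriculum_step_update are assumptions, since the question does not state them.

```python
def max_subtokens_at(step, min_subtokens=10, curriculum_step_size=5,
                     curriculum_step_update=1000):
    """Length threshold the filter is meant to apply at a given training step.

    The default values are assumed for illustration only.
    """
    # Integer division plays the role of tf.floor(step / curriculum_step_update).
    return min_subtokens + curriculum_step_size * (step // curriculum_step_update)


def keep_example(num_target_tokens, step):
    """Mirror of the filter predicate: keep examples no longer than the threshold."""
    return num_target_tokens <= max_subtokens_at(step)


if __name__ == "__main__":
    for step in (0, 999, 1000, 2500):
        print(step, max_subtokens_at(step))
```

If the step seen by the filter stays at 0, the threshold never leaves min_subtokens, which would match the behaviour described above.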
TensorFlow 1.12.1
Recommended answer
As we discussed in the comments, keeping and updating your own counter might be an alternative to using the global_step variable. The counter variable could be updated as follows:
op = tf.assign_add(counter, 1)
# tf.control_dependencies expects a list of ops or tensors, not a single op
with tf.control_dependencies([op]):
    # Some operation here before which the counter should be updated
    ...
Using tf.control_dependencies lets you "attach" the update of counter to a path within the computational graph. You can then use the counter variable wherever you need it.