Which tensorflow method decides that a particular batch of examples is for the model to learn?

Problem description

I'm trying to understand the implementation of SGD in tensorflow.

I started with gradient_descent.py because of the file name.

Per the keras docs, an optimizer needs to implement the _resource_apply_dense method, which corresponds to the code (partly) shown below:

def _resource_apply_dense(self, grad, var, apply_state=None):
    var_device, var_dtype = var.device, var.dtype.base_dtype
    coefficients = ((apply_state or {}).get((var_device, var_dtype))
                    or self._fallback_apply_state(var_device, var_dtype))

    if self._momentum:
        momentum_var = self.get_slot(var, "momentum")
        return gen_training_ops.ResourceApplyKerasMomentum(
            ...
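
For context, this is how a custom optimizer implementing _resource_apply_dense typically looks. The sketch below is illustrative, not code from the question: it assumes the optimizer_v2-era tf.keras.optimizers.Optimizer base class (TF 2.x before Keras 3, where this API changed), and the class name MinimalSGD and its hyperparameter handling are my own.

import tensorflow as tf

# Minimal sketch of a custom optimizer (assumes the optimizer_v2-era
# tf.keras.optimizers.Optimizer base class; names here are illustrative).
class MinimalSGD(tf.keras.optimizers.Optimizer):
    def __init__(self, learning_rate=0.01, name="MinimalSGD", **kwargs):
        super().__init__(name, **kwargs)
        self._set_hyper("learning_rate", learning_rate)

    def _resource_apply_dense(self, grad, var, apply_state=None):
        # Keras calls this once per (gradient, variable) pair;
        # the optimizer never sees the input batch itself.
        lr = self._get_hyper("learning_rate", var.dtype.base_dtype)
        return var.assign_sub(lr * grad)

    def _resource_apply_sparse(self, grad, var, indices, apply_state=None):
        raise NotImplementedError

    def get_config(self):
        config = super().get_config()
        config.update({
            "learning_rate": self._serialize_hyperparameter("learning_rate"),
        })
        return config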

I'd like to know who passes the var variable to the _resource_apply_dense method? In other words, which method decides that this particular batch of examples is for the model to learn?

Answer

Checking the optimizer_v2 of tensorflow keras, we find the only use of this function in the entire tensorflow codebase:

#...
def apply_grad_to_update_var(var, grad):
    #...
    if "apply_state" in self._dense_apply_args:
        apply_kwargs["apply_state"] = apply_state
    update_op = self._resource_apply_dense(grad, var, **apply_kwargs)
    if var.constraint is not None:
        with ops.control_dependencies([update_op]):
            return var.assign(var.constraint(var))
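
As an aside, the var.constraint branch above only fires for variables created with a constraint. Here is an illustrative example (not from the original post) showing how a Keras constraint ends up stored on the variable itself, so the assign above re-projects the weight after every update:

import tensorflow as tf

# Illustrative sketch: a kernel_constraint is stored on the variable,
# making var.constraint non-None in apply_grad_to_update_var above.
layer = tf.keras.layers.Dense(4, kernel_constraint=tf.keras.constraints.NonNeg())
layer.build(input_shape=(None, 8))
print(layer.kernel.constraint)  # a NonNeg instance, so the branch runs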

We later see in that same file that the var variable comes from an argument to the _distributed_apply function:

#...
def _distributed_apply(self, distribution, grads_and_vars, name, apply_state):
    #...
    with name_scope_only_in_function_or_graph(name or self._name):
        for grad, var in grads_and_vars:
            #...

Finally, the grads_and_vars argument is defined as a "List of (gradient, variable) pairs" in the function apply_gradients:

  #...
  def apply_gradients(self,
                      grads_and_vars,
                      #...
                      ):
    """...
    Args:
      grads_and_vars: List of (gradient, variable) pairs.
    """

If you check the occurrences of apply_gradients (this search), you will see that it is a common way to update the weights of the network, and is thus controlled by the "update" step of the optimizer.
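
To make the call chain concrete, here is a minimal custom training loop (an illustrative sketch; the model, dataset, and hyperparameters are my own, not from the question). In this sketch, batch selection happens in the for loop over the dataset iterator, while apply_gradients performs the update step traced above:

import tensorflow as tf

# Illustrative sketch of a custom training loop. The tf.data iterator
# supplies each batch; the optimizer never decides which examples to use.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
loss_fn = tf.keras.losses.MeanSquaredError()

dataset = tf.data.Dataset.from_tensor_slices(
    (tf.random.normal([64, 8]), tf.random.normal([64, 1]))).batch(16)

for x_batch, y_batch in dataset:  # <-- this loop supplies the batch
    with tf.GradientTape() as tape:
        loss = loss_fn(y_batch, model(x_batch, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    # apply_gradients -> _distributed_apply -> apply_grad_to_update_var
    # -> _resource_apply_dense(grad, var)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))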
