When does TensorFlow update weights and biases?


Question

When does tensorflow update weights and biases in the for loop?

Below is the training loop from mnist_softmax.py in TensorFlow's GitHub repository:

for _ in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})

  1. When does TensorFlow update the weights and biases?
  2. Does it update them when sess.run() is executed? If so, does that mean that in this program TensorFlow updates the weights and biases 1000 times?
  3. Or does it update them only after the whole for loop finishes?
  4. If 2. is correct, my next question is: does TensorFlow update the model with different training data every time (since it uses next_batch(100))? There are 1000*100 training data points in total, but each data point is considered only once individually. Am I correct, or did I misunderstand something?
  5. If 3. is correct, isn't it strange that the model would be trained after just a single update step? I think I must be misunderstanding something; it would be great if anyone could give me a hint or point me to some material.

Answer

  1. It updates the weights every time you run train_step.
  2. Yes, it updates the weights 1000 times in this program.
  3. See above.
  4. Yes, you are right; it loads a mini-batch of 100 points at a time and uses it to compute the gradients.
  5. Not strange at all. You don't necessarily need to see the same data again and again; all that is required is that you have enough data for the network to converge. You can iterate over the same data multiple times if you want, but since this model doesn't have many parameters, it converges within a single epoch.
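As a rough illustration of answers 1 and 2, one such update can be sketched in plain numpy. This is a hypothetical stand-in for what a single run of the training op triggers, not TensorFlow's actual implementation; the minibatch here is random data:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))  # numerically stabilized
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
W = np.zeros((784, 10))                          # weights, like tf.Variable(tf.zeros([784, 10]))
b = np.zeros(10)                                 # biases
batch_xs = rng.random((100, 784))                # stand-in for mnist.train.next_batch(100)
batch_ys = np.eye(10)[rng.integers(0, 10, 100)]  # stand-in one-hot labels

# One "train_step": forward pass, gradient of the cross-entropy with
# respect to the logits, then a gradient-descent update of W and b.
probs = softmax(batch_xs @ W + b)
grad_logits = (probs - batch_ys) / len(batch_xs)
W -= 0.5 * (batch_xs.T @ grad_logits)            # 0.5 is the learning rate used in mnist_softmax.py
b -= 0.5 * grad_logits.sum(axis=0)
```

Every call to sess.run(train_step, ...) performs exactly one such update, so the loop in the question performs 1000 of them.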

TensorFlow works by creating a graph of the computations that are required for computing the output of a network. Each of the basic operations, like matrix multiplication or addition, anything you can think of, is a node in this computation graph. In the TensorFlow MNIST example that you are following, lines 40-46 define the network architecture:

  • x: placeholder
  • y_: placeholder
  • W: variable - this is learned during training
  • b: variable - this is also learned during training

The network represents a simple linear regression model where the prediction is made using y = W*x + b (see line 43).
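In batched matrix form this prediction is a single multiply and add. A hypothetical numpy sketch, with shapes assumed from the MNIST example:

```python
import numpy as np

x = np.random.rand(100, 784)   # a minibatch of 100 flattened 28x28 images
W = np.zeros((784, 10))        # one column of weights per digit class
b = np.zeros(10)               # one bias per class
y = x @ W + b                  # the prediction y = W*x + b, batched
print(y.shape)                 # (100, 10): one score per class for each image
```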

Next, you configure the training procedure for your network. This code uses cross-entropy as the loss function to minimize (see line 57). The minimization is done using the gradient descent algorithm (see line 59).
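A minimal numpy sketch of the cross-entropy loss for a single example (the logits and label here are made-up values, not taken from the example):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

logits = np.array([0.1, 2.0, -1.0])   # made-up raw scores from the model
y_ = np.array([0.0, 1.0, 0.0])        # one-hot true label: class 1
cross_entropy = -np.sum(y_ * np.log(softmax(logits)))
# The loss is small here because the largest logit already belongs to
# the true class; gradient descent pushes it smaller still.
```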

At this point, your network is fully constructed. Now you need to run these nodes so that the actual computation is performed (no computation has been performed up to this point).

In the loop where sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys}) is executed, tf computes the value of train_step, which causes the GradientDescentOptimizer to take one step toward minimizing cross_entropy; this is how training progresses.
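The effect of that loop can be imitated in plain numpy on a hypothetical toy problem (this is an illustration, not TensorFlow itself): each of the 1000 iterations draws a fresh minibatch and applies one gradient-descent update, and the loss falls as training progresses.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
true_W = rng.standard_normal((20, 3))       # hidden "true" model used to label toy data
W, b = np.zeros((20, 3)), np.zeros(3)

losses = []
for _ in range(1000):                       # mirrors the 1000-iteration for loop
    xs = rng.random((100, 20))              # a fresh minibatch, like next_batch(100)
    ys = np.eye(3)[np.argmax(xs @ true_W, axis=1)]
    probs = softmax(xs @ W + b)
    losses.append(-np.mean(np.sum(ys * np.log(probs), axis=1)))
    grad = (probs - ys) / 100               # gradient of the cross-entropy
    W -= 0.5 * xs.T @ grad                  # one weight update per iteration
    b -= 0.5 * grad.sum(axis=0)
```

With all-zero weights the first recorded loss is exactly ln 3 (uniform predictions over three classes), and after 1000 updates the loss ends well below that, which is the sense in which training "progresses" one sess.run at a time.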

