Why does getting the values of model parameters and reassigning new values take longer and longer in TensorFlow?
Question
I have a Python function that takes a TensorFlow session and symbolic variables (tensors representing the parameters of the model and the gradients of those parameters). I call this function in a loop, and each subsequent call takes longer and longer. What might be the reason for that?
The code of the function is as follows:
import tensorflow as tf

def minimize_step(s, params, grads, min_lr, factor, feed_dict, score):
    '''
    Inputs:
        s - TensorFlow session
        params - list of nodes representing model parameters
        grads - list of nodes representing gradients of parameters
        min_lr - starting learning rate
        factor - growth factor for the learning rate
        feed_dict - feed dictionary used to evaluate gradients and score
                    Normally it contains X and Y
        score - score that is minimized
    Result:
        One call of this function makes an update of model parameters.
    '''
    ini_vals = [s.run(param) for param in params]
    grad_vals = [s.run(grad, feed_dict = feed_dict) for grad in grads]

    lr = min_lr
    best_score = None
    while True:
        new_vals = [ini_val - lr * grad for ini_val, grad in zip(ini_vals, grad_vals)]
        for i in range(len(new_vals)):
            s.run(tf.assign(params[i], new_vals[i]))
        score_val = s.run(score, feed_dict = feed_dict)
        if best_score is None or score_val < best_score:
            best_score = score_val
            best_lr = lr
            best_params = new_vals[:]
        else:
            # The score stopped improving: restore the best parameters and stop.
            for i in range(len(new_vals)):
                s.run(tf.assign(params[i], best_params[i]))
            break
        lr *= factor
    return best_score, best_lr
Could it be that the symbolic variables representing the model parameters somehow accumulate old values?
Recommended answer
It seems that you are missing the point of how TensorFlow 1.* is meant to be used. I won't go into details here, since you can find plenty of resources on the internet. I think this paper would be enough to understand the concept behind TensorFlow 1.*.
In your example, every iteration adds new nodes to the graph: each call to tf.assign inside the loop creates a new operation, so the graph keeps growing.
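The growth can be observed directly by counting the operations in the graph after each call. The following is a minimal sketch, not code from the question: the variable `w`, the list `op_counts`, and the use of `tensorflow.compat.v1` (so that it also runs on a TensorFlow 2.x install) are assumptions made for illustration.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: the v1 compatibility API is available

graph = tf.Graph()
with graph.as_default():
    # Hypothetical stand-in for a model parameter.
    w = tf.Variable(tf.zeros((2,)))
    init = tf.global_variables_initializer()

    op_counts = []
    with tf.Session(graph=graph) as sess:
        sess.run(init)
        for step in range(3):
            # Same pattern as in the question: tf.assign is called inside the loop,
            # so every iteration adds a new assign op (and a constant holding the
            # NumPy value) to the graph.
            new_value = np.full(2, step, dtype=np.float32)
            sess.run(tf.assign(w, new_value))
            op_counts.append(len(graph.get_operations()))

print(op_counts)  # the count grows with every iteration
```

The absolute numbers depend on the TensorFlow version, but the count should increase on every pass through the loop.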
Suppose this is your execution graph:
import tensorflow as tf
import numpy as np
x = tf.placeholder(tf.float32, (None, 2))
y = tf.placeholder(tf.int32, (None))
res = tf.keras.layers.Dense(2)(x)
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=res, labels=y)
loss_tensor = tf.reduce_mean(xentropy)
lr = tf.placeholder(tf.float32, ())
grads = tf.gradients(loss_tensor, tf.trainable_variables())
weight_updates = [tf.assign(w, w - lr * g) for g, w in zip(grads, tf.trainable_variables())]
Each time weight_updates is executed, the weights of the model are updated.
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # before
    print(sess.run(tf.trainable_variables()))
    # [array([[ 0.7586721 , -0.7465675 ],
    #         [-0.34097505, -0.83986187]], dtype=float32), array([0., 0.], dtype=float32)]
    # after
    evaluated = sess.run(weight_updates,
                         {x: np.random.normal(0, 1, (2, 2)),
                          y: np.random.randint(0, 2, 2),
                          lr: 0.001})
    print(evaluated)
    # [array([[-1.0437444 , -0.7132262 ],
    #         [-0.8282471 , -0.01127395]], dtype=float32), array([ 0.00072743, -0.00072743], dtype=float32)]
In your example, at each step you are adding an additional execution flow to the graph instead of reusing the existing one.
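One way to apply this to the function from the question is to build the assign ops once, fed through placeholders, and only run them inside the loop. The sketch below is an illustration under stated assumptions, not the answerer's code: `build_assign_ops`, the extra `placeholders` and `assign_ops` arguments, and the small least-squares problem used to exercise it are all hypothetical, and `tensorflow.compat.v1` is assumed to be available.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: the v1 compatibility API is available


def build_assign_ops(params):
    """Create one placeholder and one assign op per parameter, once (hypothetical helper)."""
    placeholders = [tf.placeholder(p.dtype.base_dtype, shape=p.shape) for p in params]
    assign_ops = [tf.assign(p, ph) for p, ph in zip(params, placeholders)]
    return placeholders, assign_ops


def minimize_step(s, params, grads, placeholders, assign_ops,
                  min_lr, factor, feed_dict, score):
    """Same line search as in the question, but it only runs pre-built ops."""
    ini_vals, grad_vals = s.run([params, grads], feed_dict=feed_dict)
    lr = min_lr
    best_score, best_lr, best_params = None, None, None
    while True:
        new_vals = [v - lr * g for v, g in zip(ini_vals, grad_vals)]
        # Feed the new values through the placeholders; no new nodes are created.
        s.run(assign_ops, feed_dict=dict(zip(placeholders, new_vals)))
        score_val = s.run(score, feed_dict=feed_dict)
        if best_score is None or score_val < best_score:
            best_score, best_lr, best_params = score_val, lr, new_vals
        else:
            # The score stopped improving: restore the best parameters and stop.
            s.run(assign_ops, feed_dict=dict(zip(placeholders, best_params)))
            break
        lr *= factor
    return best_score, best_lr


# Hypothetical toy problem: least squares with a single weight vector.
graph = tf.Graph()
with graph.as_default():
    x = tf.placeholder(tf.float32, (None, 2))
    y = tf.placeholder(tf.float32, (None,))
    w = tf.Variable(tf.zeros((2,)))
    score = tf.reduce_mean(tf.square(tf.reduce_sum(x * w, axis=1) - y))
    params = [w]
    grads = tf.gradients(score, params)
    placeholders, assign_ops = build_assign_ops(params)
    init = tf.global_variables_initializer()

rng = np.random.RandomState(0)
X = rng.normal(size=(16, 2)).astype(np.float32)
Y = X @ np.array([1.0, -2.0], dtype=np.float32)
feed = {x: X, y: Y}

with tf.Session(graph=graph) as sess:
    sess.run(init)
    initial_score = sess.run(score, feed_dict=feed)
    n_ops_before = len(graph.get_operations())
    scores = []
    for _ in range(5):
        best_score, best_lr = minimize_step(
            sess, params, grads, placeholders, assign_ops, 1e-3, 2.0, feed, score)
        scores.append(best_score)
    n_ops_after = len(graph.get_operations())

print(n_ops_before == n_ops_after)  # the graph no longer grows between calls
```

Calling `graph.finalize()` after the graph is built is another way to catch this kind of mistake, since any later attempt to add a node then raises an error.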