Why does getting the values of model parameters and reassigning new values take longer and longer in TensorFlow?


Problem description

I have a Python function that takes a TensorFlow session and symbolic variables (tensors representing the model's parameters and the gradients of those parameters). I call this function in a loop, and each subsequent call takes longer and longer, so I wonder what the reason for that might be.

The function code is as follows:

def minimize_step(s, params, grads, min_lr, factor, feed_dict, score):
    '''
    Inputs:
        s - TensorFlow session
        params - list of nodes representing model parameters
        grads - list of nodes representing gradients of parameters
        min_lr - starting learning rate
        factor - growth factor for the learning rate
        feed_dict - feed dictionary used to evaluate gradients and score
            Normally it contains X and Y
        score - score that is minimized

    Result:
        One call of this function makes an update of model parameters.
    '''
    ini_vals = [s.run(param) for param in params]
    grad_vals = [s.run(grad, feed_dict=feed_dict) for grad in grads]
    lr = min_lr
    best_score = None
    while True:
        new_vals = [ini_val - lr * grad for ini_val, grad in zip(ini_vals, grad_vals)]
        for i in range(len(new_vals)):
            s.run(tf.assign(params[i], new_vals[i]))
        score_val = s.run(score, feed_dict=feed_dict)
        if best_score is None or score_val < best_score:
            best_score = score_val
            best_lr = lr
            best_params = new_vals[:]
        else:
            for i in range(len(new_vals)):
                s.run(tf.assign(params[i], best_params[i]))
            break
        lr *= factor
    return best_score, best_lr

Could it be that the symbolic variables representing the model parameters somehow accumulate old values?

Answer

It seems that you are missing the point of how TensorFlow 1.* is used. I'm not going into details here, since you can find plenty of resources on the internet. I think this paper would be enough to understand the concept of how TensorFlow 1.* works.

In your example, you keep adding new nodes to the graph at every iteration.
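You can observe this growth directly by counting the operations in the default graph. Here is a minimal sketch (TensorFlow 1.* API; the toy variable is only for illustration) that mimics the tf.assign call inside your loop:

import tensorflow as tf

x = tf.Variable(0.0)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(3):
        # each call to tf.assign creates a brand-new assign op,
        # so the graph keeps growing and subsequent runs get slower
        sess.run(tf.assign(x, float(i)))
        print(len(tf.get_default_graph().get_operations()))

The printed op count increases on every iteration, which is exactly what happens inside your while loop.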

Suppose this is your execution graph:

import tensorflow as tf
import numpy as np

# placeholders for the input features and integer labels
x = tf.placeholder(tf.float32, (None, 2))
y = tf.placeholder(tf.int32, (None))

# a single dense layer producing logits
res = tf.keras.layers.Dense(2)(x)

xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
    logits=res, labels=y)
loss_tensor = tf.reduce_mean(xentropy)

# gradient-descent update ops, built ONCE outside any loop
lr = tf.placeholder(tf.float32, ())
grads = tf.gradients(loss_tensor, tf.trainable_variables())
weight_updates = [tf.assign(w, w - lr * g)
                  for g, w in zip(grads, tf.trainable_variables())]

Each time the weight_updates ops are executed, the weights of the model are updated.

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # before
    print(sess.run(tf.trainable_variables()))
    # [array([[ 0.7586721 , -0.7465675 ],
    #         [-0.34097505, -0.83986187]], dtype=float32), array([0., 0.], dtype=float32)]
    # after
    evaluated = sess.run(weight_updates,
                         {x: np.random.normal(0, 1, (2, 2)),
                          y: np.random.randint(0, 2, 2),
                          lr: 0.001})
    print(evaluated)
    # [array([[-1.0437444 , -0.7132262 ],
    #         [-0.8282471 , -0.01127395]], dtype=float32), array([ 0.00072743, -0.00072743], dtype=float32)]
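In a normal training loop the same weight_updates ops are run over and over, so the graph size stays constant no matter how many steps you take. A minimal sketch, reusing the tensors defined above (random data stands in for a real dataset):

    # continuing inside the same `with tf.Session() as sess:` block
    n_ops_before = len(tf.get_default_graph().get_operations())
    for step in range(100):
        sess.run(weight_updates,
                 {x: np.random.normal(0, 1, (2, 2)),
                  y: np.random.randint(0, 2, 2),
                  lr: 0.001})
    # the op count is unchanged after 100 update steps
    assert len(tf.get_default_graph().get_operations()) == n_ops_before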

In your example, at each step you are adding an additional execution flow to the graph instead of reusing the existing one.
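Applied to your function, the remedy is to build the placeholders and assign ops once, outside the loop, and then only feed new values through them. A sketch of that restructuring (the build_assign_ops helper and its parameter names are illustrative, not part of any TensorFlow API):

import tensorflow as tf

def build_assign_ops(params):
    # one placeholder and one assign op per parameter, created ONCE
    placeholders = [tf.placeholder(p.dtype.base_dtype, p.get_shape())
                    for p in params]
    assign_ops = [tf.assign(p, ph) for p, ph in zip(params, placeholders)]
    return placeholders, assign_ops

def minimize_step(s, params, placeholders, assign_ops, grads,
                  min_lr, factor, feed_dict, score):
    ini_vals = s.run(params)
    grad_vals = s.run(grads, feed_dict=feed_dict)
    lr = min_lr
    best_score, best_lr, best_params = None, min_lr, ini_vals
    while True:
        new_vals = [v - lr * g for v, g in zip(ini_vals, grad_vals)]
        # reuse the pre-built assign ops: no new nodes are added here
        s.run(assign_ops, dict(zip(placeholders, new_vals)))
        score_val = s.run(score, feed_dict=feed_dict)
        if best_score is None or score_val < best_score:
            best_score, best_lr, best_params = score_val, lr, new_vals
        else:
            s.run(assign_ops, dict(zip(placeholders, best_params)))
            break
        lr *= factor
    return best_score, best_lr

With this version the graph is frozen after construction, so every call of minimize_step takes roughly the same time.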

