TensorFlow: 'ValueError: No gradients provided for any variable'

Problem description

I'm implementing DeepMind's DQN algorithm in TensorFlow and running into this error on the line where I call optimizer.minimize(self.loss):

ValueError: No gradients provided for any variable...

From reading other posts about this error I've gathered that it means that the loss function doesn't depend on any of the tensors used to set up the model, but in my code I can't see how that could be. The qloss() function clearly depends on a call to the predict() function, which depends on all of the layer tensors to make its calculations.
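
For reference, the error fires whenever the loss tensor has no path back to any trainable variable. A minimal TF 1.x sketch (not the actual DQN model; the variable and constant here are made up purely to trigger the error) looks like this:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 4])
w = tf.Variable(tf.zeros([4, 1]))
pred = tf.matmul(x, w)

# The loss below is a constant with no dependence on `pred` or `w`, so
# compute_gradients() returns None for every variable and minimize()
# raises "ValueError: No gradients provided for any variable".
loss = tf.constant(0.5)

train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)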

The model setup code can be seen here.

Recommended answer

I figured out that the issue was that, in my qloss() function, I was pulling values out of the tensors, doing operations on them, and returning the values. While the values did depend on the tensors, they weren't themselves encapsulated in tensors, so TensorFlow couldn't tell that they depended on the tensors in the graph.
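
The difference is easy to see in isolation. In the sketch below (illustrative only; the variable `w` and the toy arithmetic are not from the original code), the first loss is built from values pulled out of the graph with sess.run(), so tf.gradients() cannot trace it back to the variable; the second does the same math with tf ops and keeps the gradient path intact:

import numpy as np
import tensorflow as tf

with tf.Graph().as_default(), tf.Session() as sess:
    w = tf.Variable([1.0, 2.0, 3.0])
    sess.run(tf.global_variables_initializer())

    # Graph-breaking: pull the values out, do the math in numpy, wrap the
    # result back up. The resulting constant has no dependence on `w`.
    broken = tf.constant(np.square(sess.run(w) - 1.0))
    print(tf.gradients(tf.reduce_sum(broken), w))        # [None]

    # Graph-preserving: the same math as tf ops on the tensor itself.
    ok = tf.square(w - 1.0)
    print(sess.run(tf.gradients(tf.reduce_sum(ok), w)))  # [array([0., 2., 4.], ...)]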

I fixed this by changing qloss() so that it did operations directly on the tensors and returned a tensor. Here's the new function:

def qloss(actions, rewards, target_Qs, pred_Qs):
    """
    Q-function loss with target freezing - the difference between the observed
    Q value, taking into account the recently received r (while holding future
    Qs at target), and the predicted Q value the agent had for (s, a) at the
    time of the update.

    Params:
    actions   - The action for each experience in the minibatch
    rewards   - The reward for each experience in the minibatch
    target_Qs - The target Q value from s' for each experience in the minibatch
    pred_Qs   - The Q values predicted by the model network

    Returns:
    A tensor with the Q-function loss for each experience, with the error
    clipped to [-1, 1] and then squared.
    """
    ys = rewards + DISCOUNT * target_Qs

    # For each row of pred_Qs in the batch, we want the predicted Q for the
    # action taken at that experience. So we build a [BATCH_SIZE, 2] tensor of
    # indices [experience#, action#] and use it to gather from pred_Qs.
    gather_is = tf.stack([tf.range(BATCH_SIZE), actions], axis=1)
    action_Qs = tf.gather_nd(pred_Qs, gather_is)

    losses = ys - action_Qs
    clipped_squared_losses = tf.square(tf.minimum(tf.abs(losses), 1))

    return clipped_squared_losses
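
With qloss() returning a tensor, it can be wired straight into the graph and reduced to the scalar that optimizer.minimize() expects. A rough sketch of that wiring (the constants, placeholder shapes, the one-layer stand-in for the model network, and the RMSProp settings are all assumptions, not details from the original post):

import tensorflow as tf

BATCH_SIZE, NUM_ACTIONS, DISCOUNT = 32, 4, 0.99            # assumed values

states    = tf.placeholder(tf.float32, [BATCH_SIZE, 8])    # assumed state size
actions   = tf.placeholder(tf.int32,   [BATCH_SIZE])
rewards   = tf.placeholder(tf.float32, [BATCH_SIZE])
target_Qs = tf.placeholder(tf.float32, [BATCH_SIZE])

# Stand-in for the model network's Q output, shape [BATCH_SIZE, NUM_ACTIONS].
pred_Qs = tf.layers.dense(states, NUM_ACTIONS)

# Per-experience losses from qloss(), reduced to a scalar for the optimizer.
loss = tf.reduce_mean(qloss(actions, rewards, target_Qs, pred_Qs))
train_op = tf.train.RMSPropOptimizer(0.00025).minimize(loss)   # gradients now reach the layer weights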
