Computing the gradient of the loss using Tensorflow.js


Problem Description

I am trying to compute the gradient of a loss with respect to a network's trainable weights using Tensorflow.js, in order to apply these gradients to my network's weights. In Python, this is easily done using the tf.gradients() function, which takes two minimum inputs representing dx and dy. However, I am not able to reproduce this behavior in Tensorflow.js. I am not sure whether my understanding of the gradient of the loss w.r.t. the weights is wrong, or if my code contains mistakes.

I have spent some time analysing the core code of the tfjs-node package to understand how it is done when we call the function tf.model.fit(), but with little success so far.

let model = build_model(); //Two stacked dense layers followed by two parallel dense layers for the output
let loss = compute_loss(...); //This function returns a tf.Tensor of shape [1] containing the mean loss for the batch.
const f = () => loss;
const grad = tf.variableGrads(f);
grad(model.getWeights());

The model.getWeights() function returns an array of tf.variable(), so I assumed the function would compute dL/dW for each layer, which I could later apply to my network's weights. However, that's not quite the case, as I get this error:

Error: Cannot compute gradient of y=f(x) with respect to x. Make sure that the f you passed encloses all operations that lead from x to y.

I don't quite understand what this error means. How am I supposed to compute the gradient of the loss (analogous to tf.gradients() in Python) using Tensorflow.js, then?

Edit: This is the function computing the loss:

function compute_loss(done, new_state, memory, agent, gamma=0.99) {
    let reward_sum = 0.;
    if(done) {
        reward_sum = 0.;
    } else {
        reward_sum = agent.call(tf.oneHot(new_state, 12).reshape([1, 9, 12]))
                    .values.flatten().get(0);
    }

    let discounted_rewards = [];
    let memory_reward_rev = memory.rewards;
    for(let reward of memory_reward_rev.reverse()) {
        reward_sum = reward + gamma * reward_sum;
        discounted_rewards.push(reward_sum);
    }
    discounted_rewards.reverse();

    let onehot_states = [];
    for(let state of memory.states) {
        onehot_states.push(tf.oneHot(state, 12));
    }
    let init_onehot = onehot_states[0];

    for(let i=1; i<onehot_states.length;i++) {
        init_onehot = init_onehot.concat(onehot_states[i]);
    }

    let log_val = agent.call(
        init_onehot.reshape([memory.states.length, 9, 12])
    );

    let disc_reward_tensor = tf.tensor(discounted_rewards);
    let advantage = disc_reward_tensor.reshapeAs(log_val.values).sub(log_val.values);
    let value_loss = advantage.square();
    log_val.values.print();

    let policy = tf.softmax(log_val.logits);
    let logits_cpy = log_val.logits.clone();

    let entropy = policy.mul(logits_cpy.mul(tf.scalar(-1))); 
    entropy = entropy.sum();

    let memory_actions = [];
    for(let i=0; i< memory.actions.length; i++) {
        memory_actions.push(new Array(2000).fill(0));
        memory_actions[i][memory.actions[i]] = 1;
    }
    memory_actions = tf.tensor(memory_actions);
    let policy_loss = tf.losses.softmaxCrossEntropy(memory_actions.reshape([memory.actions.length, 2000]), log_val.logits);

    let value_loss_copy = value_loss.clone();
    let entropy_mul = (entropy.mul(tf.scalar(0.01))).mul(tf.scalar(-1));
    let total_loss_1 = value_loss_copy.mul(tf.scalar(0.5, 'float32'));

    let total_loss_2 = total_loss_1.add(policy_loss);
    let total_loss = total_loss_2.add(entropy_mul);
    total_loss.print();
    return total_loss.mean();

}

Edit 2:

I managed to use compute_loss as the loss function specified in model.compile(). But then it is required to take only two inputs (predictions, labels), so it doesn't work for me, as I want to pass in multiple parameters.

Recommended Answer

The error says it all. Your issue has to do with tf.variableGrads. The loss should be a scalar computed using the available tf tensor operators; the loss function should not return a non-scalar tensor as it does in your question.

Here is an example of what loss should be:

const a = tf.variable(tf.tensor1d([3, 4]));
const b = tf.variable(tf.tensor1d([5, 6]));
const x = tf.tensor1d([1, 2]);

const f = () => a.mul(x.square()).add(b.mul(x)).sum(); // f is a function
// df/da = x ^ 2, df/db = x 
const {value, grads} = tf.variableGrads(f); // gradients of f with respect to each variable

Object.keys(grads).forEach(varName => grads[varName].print());

/!\ Notice that the gradients are calculated with respect to variables created using tf.variable.
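
Since the question's stated goal was to apply these gradients back to the weights, here is a minimal follow-up sketch (not part of the original answer; the optimizer and learning rate are arbitrary choices) showing how the grads map from the example above could be applied with a built-in optimizer:

const optimizer = tf.train.sgd(0.1); // illustrative learning rate
// applyGradients accepts the name-to-tensor map returned by tf.variableGrads
// and updates each corresponding tf.variable in place.
optimizer.applyGradients(grads);
a.print(); // a and b now hold their updated values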

Update:

You're not computing the gradients the way they should be computed. Here is the fix:

function compute_loss(done, new_state, memory, agent, gamma=0.99) {
    const f = () => {
    let reward_sum = 0.;
    if(done) {
        reward_sum = 0.;
    } else {
        reward_sum = agent.call(tf.oneHot(new_state, 12).reshape([1, 9, 12]))
                    .values.flatten().get(0);
    }

    let discounted_rewards = [];
    let memory_reward_rev = memory.rewards;
    for(let reward of memory_reward_rev.reverse()) {
        reward_sum = reward + gamma * reward_sum;
        discounted_rewards.push(reward_sum);
    }
    discounted_rewards.reverse();

    let onehot_states = [];
    for(let state of memory.states) {
        onehot_states.push(tf.oneHot(state, 12));
    }
    let init_onehot = onehot_states[0];

    for(let i=1; i<onehot_states.length;i++) {
        init_onehot = init_onehot.concat(onehot_states[i]);
    }

    let log_val = agent.call(
        init_onehot.reshape([memory.states.length, 9, 12])
    );

    let disc_reward_tensor = tf.tensor(discounted_rewards);
    let advantage = disc_reward_tensor.reshapeAs(log_val.values).sub(log_val.values);
    let value_loss = advantage.square();
    log_val.values.print();

    let policy = tf.softmax(log_val.logits);
    let logits_cpy = log_val.logits.clone();

    let entropy = policy.mul(logits_cpy.mul(tf.scalar(-1))); 
    entropy = entropy.sum();

    let memory_actions = [];
    for(let i=0; i< memory.actions.length; i++) {
        memory_actions.push(new Array(2000).fill(0));
        memory_actions[i][memory.actions[i]] = 1;
    }
    memory_actions = tf.tensor(memory_actions);
    let policy_loss = tf.losses.softmaxCrossEntropy(memory_actions.reshape([memory.actions.length, 2000]), log_val.logits);

    let value_loss_copy = value_loss.clone();
    let entropy_mul = (entropy.mul(tf.scalar(0.01))).mul(tf.scalar(-1));
    let total_loss_1 = value_loss_copy.mul(tf.scalar(0.5, 'float32'));

    let total_loss_2 = total_loss_1.add(policy_loss);
    let total_loss = total_loss_2.add(entropy_mul);
    total_loss.print();
    return total_loss.mean().asScalar();
}

return tf.variableGrads(f);
}

Notice that you can quickly run into memory consumption issues. It would be advisable to surround the differentiated function with tf.tidy to dispose of the intermediate tensors.
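
For illustration only, a hypothetical train_step wrapper (the function name and optimizer argument are assumptions, not from the answer) could combine tf.tidy with the fixed compute_loss like this:

// Hypothetical training step: tf.tidy disposes every intermediate tensor
// created while computing the loss, keeping only the returned gradients.
function train_step(done, new_state, memory, agent, optimizer) {
    const grads = tf.tidy(
        () => compute_loss(done, new_state, memory, agent).grads
    );
    optimizer.applyGradients(grads); // update the model's tf.variable weights
    tf.dispose(grads);               // free the gradient tensors afterwards
}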
