How to assign a value to a tf.Variable in TensorFlow without using tf.assign


Problem Description

I have a variable that contains a 4x4 identity matrix. I would like to assign some values to this matrix (these values are learned by the model).

When I use tf.assign(), I get an error saying that strided slices do not have gradients. My question is: how can I do this without using tf.assign()?

Here is a code sample of the desired behaviour (without the error, since the values are not learned here):

import tensorflow as tf

params = [[1.0, 2.0, 3.0]]
M = tf.Variable(tf.eye(4, batch_shape=[1]), dtype=tf.float32)
# write params into the first three rows of the last column
M = tf.assign(M[:, 0:3, 3], params)

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
output_val = sess.run(M)

Note - the variable is created solely for the purpose of housing these parameters.
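
Since the variable exists only to hold these parameters, one alternative (a minimal sketch, not from the question or the answer below; params_var is an assumed name) is to make the parameters themselves the tf.Variable and compose the matrix with tf.concat, so no assignment is needed at all:

import tensorflow as tf

# Sketch: learnable parameters as the variable, matrix built functionally.
params_var = tf.Variable([[1.0, 2.0, 3.0]], dtype=tf.float32)        # shape (1, 3)
top = tf.concat([tf.eye(3, batch_shape=[1]),                          # (1, 3, 3)
                 tf.reshape(params_var, (1, 3, 1))], axis=2)          # (1, 3, 4)
bottom = tf.constant([[[0.0, 0.0, 0.0, 1.0]]], dtype=tf.float32)      # (1, 1, 4)
M = tf.concat([top, bottom], axis=1)                                  # (1, 4, 4)

Gradients of any loss built from M then flow directly into params_var.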

UPDATE: I am adding a minimal working example that creates the error. (Obviously, training like this won't result in anything good; it is just to illustrate the error, since my code is far too long to copy here.)

import numpy as np
import tensorflow as tf

params = [[1.0, 2.0, 3.0]]
M_gt = np.eye(4)
M_gt[0:3, 3] = [4.0, 5.0, 6.0]

M = tf.Variable(tf.eye(4, batch_shape=[1]), dtype=tf.float32)
M = tf.assign(M[:, 0:3, 3], params)

loss = tf.nn.l2_loss(M - M_gt)
optimizer = tf.train.AdamOptimizer(0.001)
train_op = optimizer.minimize(loss)  # this is where the "strided slices do not have gradients" error appears
sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)
sess.run(train_op)

Answer

Here is an example of how you could do what (I think) you want:

import tensorflow as tf
import numpy as np

with tf.Graph().as_default(), tf.Session() as sess:
    params = [[1.0, 2.0, 3.0]]
    M_gt = np.eye(4)
    M_gt[0:3, 3] = [4.0, 5.0, 6.0]

    M = tf.Variable(tf.eye(4, batch_shape=[1]), dtype=tf.float32)
    params_t = tf.constant(params, dtype=tf.float32)

    shape_m = tf.shape(M)
    batch_size = shape_m[0]
    num_m = shape_m[1]
    num_params = tf.shape(params_t)[1]

    # Build a tensor that holds the params in the last column and zeros everywhere else
    last_column = tf.concat([tf.tile(tf.transpose(params_t)[tf.newaxis], (batch_size, 1, 1)),
                             tf.zeros((batch_size, num_m - num_params, 1), dtype=params_t.dtype)], axis=1)
    replace = tf.concat([tf.zeros((batch_size, num_m, num_m - 1), dtype=params_t.dtype), last_column], axis=2)

    # Boolean mask that is True only at the entries to be replaced
    # (the first num_params rows of the last column)
    r = tf.range(num_m)
    ii = r[tf.newaxis, :, tf.newaxis]
    jj = r[tf.newaxis, tf.newaxis, :]
    mask = tf.tile((ii < num_params) & (tf.equal(jj, num_m - 1)), (batch_size, 1, 1))
    # Pick values from `replace` where the mask is True and from M elsewhere
    M_replaced = tf.where(mask, replace, M)

    loss = tf.nn.l2_loss(M_replaced - M_gt[np.newaxis])
    optimizer = tf.train.AdamOptimizer(0.001)
    train_op = optimizer.minimize(loss)
    init = tf.global_variables_initializer()
    sess.run(init)
    M_val, M_replaced_val = sess.run([M, M_replaced])
    print('M:')
    print(M_val)
    print('M_replaced:')
    print(M_replaced_val)

Output:

M:
[[[ 1.  0.  0.  0.]
  [ 0.  1.  0.  0.]
  [ 0.  0.  1.  0.]
  [ 0.  0.  0.  1.]]]
M_replaced:
[[[ 1.  0.  0.  1.]
  [ 0.  1.  0.  2.]
  [ 0.  0.  1.  3.]
  [ 0.  0.  0.  1.]]]
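
In the answer above params_t is a tf.constant, so nothing is actually trained; the point of the tf.where construction is that gradients do flow through it into whatever produces the replacement values. A minimal sketch of training the parameters this way (params_var and the hard-coded mask are illustrative assumptions, not part of the original answer):

import numpy as np
import tensorflow as tf

with tf.Graph().as_default(), tf.Session() as sess:
    M_gt = np.eye(4, dtype=np.float32)
    M_gt[0:3, 3] = [4.0, 5.0, 6.0]

    # Hypothetical learnable parameters
    params_var = tf.Variable([[1.0, 2.0, 3.0]], dtype=tf.float32)

    # M kept as a plain tensor here for brevity
    M = tf.eye(4, batch_shape=[1])
    last_col = tf.concat([tf.transpose(params_var)[tf.newaxis],
                          tf.zeros((1, 1, 1))], axis=1)               # (1, 4, 1)
    replace = tf.concat([tf.zeros((1, 4, 3)), last_col], axis=2)      # (1, 4, 4)
    mask = tf.constant([[[False, False, False, True]] * 3 +
                        [[False, False, False, False]]])              # (1, 4, 4)
    M_replaced = tf.where(mask, replace, M)

    loss = tf.nn.l2_loss(M_replaced - M_gt[np.newaxis])
    train_op = tf.train.AdamOptimizer(0.1).minimize(loss)             # gradients reach params_var

    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        sess.run(train_op)
    print(sess.run(params_var))  # should approach [[4. 5. 6.]]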

