Best way to mimic PyTorch sliced assignment with Keras/Tensorflow


Problem description

I am trying to mimic the operation done in PyTorch below:

vol = Variable(torch.FloatTensor(A, B*2, C, D, E).zero_()).cuda()
for i in range(C):
  if i > 0 :
    vol[:, :B, i, :,i:] = input0[:,:,:,i:]
    vol[:, B:, i, :,i:] = input1[:,:,:,:-i]
  else:
    vol[:, :B, i, :,:] = input0
    vol[:, B:, i, :,:] = input1

So far, I have tried using the following sliced assignment in TF and wrapping it in a Keras Lambda layer:

vol = tf.Variable(K.zeros((A, D, E, C, B*2)))
for i in range(C):
  if i > 0:
    vol[:, :, i:, i, :B].assign(input0[:,:,i:,:])
    vol[:, :, i:, i, B:].assign(input1[:,:,:-i,:])
  else:
    vol[:, :, :, i, :B].assign(input0)
    vol[:, :, :, i, B:].assign(input1)
return vol

I have also tried vol = vol[...].assign(...).

This assigns the values to the vol variable correctly, which I can then convert to a tensor to use in the rest of my graph. However, the gradient of this operation is undefined in TF (LookupError: No gradient defined for operation 'strided_slice/_assign' (op type: StridedSliceAssign)), and the gradient doesn't get propagated to the previous layers that generate input0 and input1, while they do appear to get transferred in the PyTorch implementation. Is there a way to construct this same variable in TF such that the gradient is defined and my previous operations don't have a None gradient?
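For reference, here is a minimal sketch (hypothetical shapes, TensorFlow 1.x graph mode) of the kind of sliced variable assignment that triggers the missing-gradient error described above:

import tensorflow as tf

# Hypothetical small example: assign a slice of a variable from a placeholder.
x = tf.placeholder(tf.float32, shape=(2, 3))
v = tf.Variable(tf.zeros((2, 3)))
assigned = v[:, 1:].assign(x[:, 1:])   # sliced assignment -> StridedSliceAssign op

# Computing gradients through the assignment raises the LookupError quoted above,
# since no gradient is registered for StridedSliceAssign.
grads = tf.gradients(tf.reduce_sum(assigned), x)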

Recommended answer

You need to construct the tensor "by hand". Assuming both input0 and input1 have shape (A, D, E, B), you can do something like this:

import numpy as np
import tensorflow as tf

# Make the indexing mask with TensorFlow
in_shape = tf.shape(input0)
in_dims = 4
idx = tf.meshgrid(*[tf.range(in_shape[i]) for i in range(in_dims)], indexing='ij')[2]
idx = tf.expand_dims(idx, axis=3)
r = tf.range(C)[tf.newaxis, tf.newaxis, tf.newaxis, :, tf.newaxis]
mask = idx >= r

# If all dimensions are known at graph construction time, you can instead
# make the mask with NumPy like this to save graph computation time
idx = np.meshgrid(*[np.arange(d) for d in (A, D, E, B)], indexing='ij')[2]
idx = np.expand_dims(idx, 3)
r = np.arange(C)[np.newaxis, np.newaxis, np.newaxis, :, np.newaxis]
mask = idx >= r

# Make the tensor
input0_tile = tf.tile(tf.expand_dims(input0, 3), (1, 1, 1, C, 1))
input1_tile = tf.tile(tf.expand_dims(input1, 3), (1, 1, 1, C, 1))
zero_tile = tf.zeros_like(input0_tile)
vol0 = tf.where(mask, input0_tile, zero_tile)
vol1 = tf.where(mask, input1_tile, zero_tile)
vol = tf.concat([vol0, vol1], axis=-1)

Note that you need either the first or the second block followed by the third block, not all three blocks (see the comments in the code). The code builds a binary mask using tf.meshgrid and a tf.range of indices, then uses tf.where to select values from the inputs or zeros.
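To connect this back to the original Keras setup, below is a minimal sketch of wrapping the first and third blocks in a Lambda layer; the fixed sizes, the make_volume name, and the tf.keras imports are illustrative assumptions, not part of the original answer.

import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda

# Illustrative sizes; C (the number of shifts) must be a Python int.
B, C, D, E = 8, 4, 16, 16

def make_volume(inputs):
    input0, input1 = inputs                                  # each: (batch, D, E, B)
    in_shape = tf.shape(input0)
    # Index along the E axis, broadcast against the C shifts to form the mask
    idx = tf.meshgrid(*[tf.range(in_shape[i]) for i in range(4)], indexing='ij')[2]
    idx = tf.expand_dims(idx, axis=3)                        # (batch, D, E, 1, B)
    r = tf.range(C)[tf.newaxis, tf.newaxis, tf.newaxis, :, tf.newaxis]
    mask = idx >= r                                          # (batch, D, E, C, B)
    input0_tile = tf.tile(tf.expand_dims(input0, 3), (1, 1, 1, C, 1))
    input1_tile = tf.tile(tf.expand_dims(input1, 3), (1, 1, 1, C, 1))
    zeros = tf.zeros_like(input0_tile)
    vol0 = tf.where(mask, input0_tile, zeros)
    vol1 = tf.where(mask, input1_tile, zeros)
    return tf.concat([vol0, vol1], axis=-1)                  # (batch, D, E, C, 2*B)

in0 = Input(shape=(D, E, B))
in1 = Input(shape=(D, E, B))
vol = Lambda(make_volume)([in0, in1])

Because the volume is built entirely from differentiable ops (tile, where, concat) rather than variable assignment, gradients propagate back to input0 and input1.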

