batch_dot with variable batch size in Keras


Question

I'm trying to write a layer that merges two tensors, x[0] and x[1], using a formula that involves a matrix M.

The shapes of x[0] and x[1] are both (?, 1, 500).

M is a 500×500 matrix.

I want the output to have shape (?, 500, 500), which I believe is feasible in theory. For every pair of inputs, each of shape (1, 1, 500), the layer should output (1, 500, 500). Because the batch size is variable (dynamic), the output shape must be (?, 500, 500).

However, I know little about axes. I have tried every combination of axes, but none of the results make sense.

I tried numpy.tensordot and keras.backend.batch_dot (TensorFlow backend). If the batch size is fixed, for example with a of shape (100, 1, 500), then batch_dot(a, M, (2, 0)) can produce an output of shape (100, 1, 500).
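
The fixed-batch case above can be checked in plain NumPy. The following is only a sketch using numpy.tensordot (not keras.backend.batch_dot), with all-ones placeholder arrays; the names feature_dim and shapes are illustrative. It suggests that contracting axis 2 of a with axis 0 of M gives (batch, 1, 500) for any batch size:

```python
import numpy as np

feature_dim = 500
M = np.ones((feature_dim, feature_dim))  # placeholder for the 500x500 matrix

shapes = {}
for batch_size in (1, 7, 100):
    a = np.ones((batch_size, 1, feature_dim))  # placeholder input batch
    # Contract the last axis of `a` (size 500) with the first axis of M.
    out = np.tensordot(a, M, axes=([2], [0]))
    shapes[batch_size] = out.shape

print(shapes)
```

Because M carries no batch axis here, nothing in the contraction depends on the batch size.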

I'm new to Keras. Sorry for such a basic question, but I have spent 2 days trying to figure this out and it is driving me crazy :(

    def call(self, x):
        input1 = x[0]
        input2 = x[1]
        # self.M is defined in the build function
        output = K.batch_dot(...)
        return output

Update:

Sorry for the late reply. I tried Daniel's answer with TensorFlow as the Keras backend, and it still raises a ValueError for unequal dimensions.

I then tried the same code with Theano as the backend, and it works.

>>> import numpy as np
>>> import keras.backend as K
Using Theano backend.
>>> from keras.layers import Input
>>> x1 = Input(shape=[1,500,])
>>> M = K.variable(np.ones([1,500,500]))
>>> firstMul = K.batch_dot(x1, M, axes=[1,2])

I don't know how to print a tensor's shape in Theano. It's definitely harder than TensorFlow for me... However, it works.

To understand why, I read through the batch_dot code in both the TensorFlow and Theano backends. The differences are as follows.

In this case, x has shape (?, 1, 500), y has shape (1, 500, 500), and axes = [1, 2].

In tensorflow_backend:

return tf.matmul(x, y, adjoint_a=True, adjoint_b=True)

In theano_backend:

return T.batched_tensordot(x, y, axes=axes)

(This assumes that the subsequent changes to out._keras_shape do not affect the value of out.)
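
The ValueError under TensorFlow looks consistent with these shapes. Below is a rough NumPy sketch, not the backend code itself. It assumes that, for real-valued tensors, adjoint_a=True and adjoint_b=True amount to swapping the last two axes of each operand before a batched matrix multiply. The batch size of 4 and the all-ones arrays are placeholders:

```python
import numpy as np

x = np.ones((4, 1, 500))    # stands in for x with shape (?, 1, 500)
y = np.ones((1, 500, 500))  # y with shape (1, 500, 500)

# Emulate adjoint_a=True / adjoint_b=True for real-valued inputs.
x_t = np.swapaxes(x, 1, 2)  # shape (4, 500, 1)
y_t = np.swapaxes(y, 1, 2)  # shape (1, 500, 500)

raised_value_error = False
try:
    # The inner dimensions are 1 and 500, so the product is undefined.
    np.matmul(x_t, y_t)
except ValueError:
    raised_value_error = True

print(x_t.shape, y_t.shape, raised_value_error)
```

After the transposes, the contracted dimensions are 1 and 500, which do not match. This agrees with the "unequal dimensions" error reported in the update.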

Recommended answer

For each multiplication, you need to choose which axes batch_dot contracts over.

  • Axis 0 - the batch dimension, it's your ?
  • Axis 1 - the dimension you say has length 1
  • Axis 2 - the last dimension, of size 500

You won't change the batch dimension, so you will always use batch_dot with axes=[1,2].
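
To illustrate what "not changing the batch dimension" means, here is a small NumPy sketch. It uses np.einsum as a stand-in for a batched product, not batch_dot itself, and the sizes and array names are made up for illustration. Axis 0 is carried through untouched, and each sample is multiplied independently of the others:

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, d = 3, 5
x = rng.standard_normal((batch_size, 1, d))    # like (?, 1, 500), but smaller
M_b = rng.standard_normal((batch_size, d, d))  # one matrix per sample

# 'b' is the batch axis: it appears in both inputs and in the output,
# so it is never summed over. Only 'j' is contracted.
batched = np.einsum('bij,bjk->bik', x, M_b)

# The same result, computed one sample at a time.
per_sample = np.stack([x[i] @ M_b[i] for i in range(batch_size)])

print(batched.shape, np.allclose(batched, per_sample))
```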

But for that to work, you must adjust M to have shape (?, 500, 500).
To do that, define M not as (500,500) but as (1,500,500), and repeat it along the first axis to match the batch size:

import keras.backend as K

# M has shape (1,500,500); repeat it along the batch axis.
BatchM = K.repeat_elements(x=M, rep=batch_size, axis=0)
# Not sure if repeating is really necessary: leaving M as (1,500,500) gives the
# same output shape at the end, but I haven't checked the actual numbers for
# correctness. I believe it's totally ok.

# Now we can use batch_dot properly:
firstMul = K.batch_dot(x[0], BatchM, axes=[1,2])  # will result in (?,500,500)

# We also need to transpose x[1]:
x1T = K.permute_dimensions(x[1], (0,2,1))

# And the second multiplication:
result = K.batch_dot(firstMul, x1T, axes=[1,2])
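
On the question of whether repeating M is necessary, the following NumPy sketch compares an explicitly repeated M with a broadcast one. It only demonstrates NumPy's matmul broadcasting; Keras backends may treat a size-1 batch axis differently, as the update in the question suggests. The small sizes and variable names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, d = 3, 4
M = rng.standard_normal((1, d, d))           # like (1, 500, 500), but smaller
x = rng.standard_normal((batch_size, 1, d))  # like (?, 1, 500)

# Explicit repetition along the batch axis, as in the answer.
batch_M = np.repeat(M, batch_size, axis=0)   # shape (3, 4, 4)
out_repeated = np.matmul(x, batch_M)

# Rely on broadcasting of the size-1 batch axis instead.
out_broadcast = np.matmul(x, M)

print(out_repeated.shape, np.allclose(out_repeated, out_broadcast))
```

In NumPy the two results match, so the repetition only costs memory there. Whether the same holds for a given Keras backend would need to be checked against that backend.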
