TensorFlow - matmul of input matrix with batch data
Question
I have some data represented by input_x. It is a tensor of unknown size (it is fed in by batch), and each item in it is of size n. input_x goes through tf.nn.embedding_lookup, so embed now has dimensions [?, n, m], where m is the embedding size and ? refers to the unknown batch size.
The setup is:
input_x = tf.placeholder(tf.int32, [None, n], name="input_x")
embed = tf.nn.embedding_lookup(W, input_x)
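To make the shapes concrete, here is a minimal NumPy sketch of what the lookup produces. It is only an illustration: NumPy integer indexing stands in for tf.nn.embedding_lookup, and the sizes (a vocabulary of 10, n = 4, m = 3, a batch of 2) are made up and do not come from the question.

```python
import numpy as np

vocab_size, n, m = 10, 4, 3  # hypothetical sizes, not from the question

# Embedding table with one row of length m per id
W = np.arange(vocab_size * m, dtype=np.float32).reshape(vocab_size, m)

# A batch of 2 samples, each holding n integer ids
input_x = np.array([[0, 1, 2, 3],
                    [9, 9, 0, 5]])

# Row lookup by id; plays the role of tf.nn.embedding_lookup(W, input_x)
embed = W[input_x]
print(embed.shape)  # (2, 4, 3), i.e. [batch, n, m]
```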
I'm now trying to multiply each sample in my input data (which is now expanded by the embedding dimension) by a matrix variable, U, and I can't work out how to do that.
I first tried using tf.matmul, but it gives an error due to a mismatch in shapes. I then tried the following, expanding the dimensions of U and applying batch_matmul (I also tried the function from tf.nn.math_ops., and the result was the same):
U = tf.Variable( ... )
U1 = tf.expand_dims(U,0)
h=tf.batch_matmul(embed, U1)
This passes graph construction, but when actual data is fed in, I get the following error:
In[0].dim(0) and In[1].dim(0) must be the same: [64,58,128] vs [1,128,128]
I also know why this is happening: I expanded the dimensions of U, so its leading dimension is now 1, but the minibatch size, 64, does not match it.
How can I do that matrix multiplication on my tensor-matrix input correctly (for an unknown batch size)?
Recommended answer
In the TensorFlow version used here, the matmul operation only works on matrices (2D tensors). There are two main approaches to do this, and both assume that U is a 2D tensor.
The first is to slice embed into 2D tensors and multiply each of them by U individually. This is probably easiest to do using tf.scan(), like this:
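# Without an initializer, tf.scan passes embed[0] through unmultiplied; c is the number of columns in U.
h = tf.scan(lambda a, x: tf.matmul(x, U), embed, initializer=tf.zeros([n, c]))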
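The per-sample idea can be checked outside TensorFlow with a small NumPy sketch. This is an illustration under assumptions, not the answer's code: a plain Python loop stands in for tf.scan, and the sizes and random data are hypothetical.

```python
import numpy as np

batch, n, m, c = 5, 4, 3, 2  # hypothetical sizes
rng = np.random.default_rng(0)
embed = rng.standard_normal((batch, n, m))  # stands in for the embedded batch
U = rng.standard_normal((m, c))             # 2D weight matrix

# Multiply each [n, m] slice by U on its own, then stack along the batch axis
h_loop = np.stack([sample @ U for sample in embed])
print(h_loop.shape)  # (5, 4, 2), i.e. [batch, n, c]
```

Every sample, including the first one, is multiplied by U, which is the behaviour the slicing approach is meant to have.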
On the other hand, if efficiency is important, it may be better to reshape embed into a 2D tensor so that the multiplication can be done with a single matmul, like this:
embed = tf.reshape(embed, [-1, m])
h = tf.matmul(embed, U)
h = tf.reshape(h, [-1, n, c])
where c is the number of columns in U. The last reshape makes sure that h is a 3D tensor whose 0th dimension corresponds to the batch, just like the original input_x and embed.
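As a sanity check on the reshape approach, the following NumPy sketch compares it against multiplying each sample separately. NumPy is used here only as a stand-in for the TensorFlow ops, and the sizes and random data are hypothetical.

```python
import numpy as np

batch, n, m, c = 5, 4, 3, 2  # hypothetical sizes
rng = np.random.default_rng(0)
embed = rng.standard_normal((batch, n, m))
U = rng.standard_normal((m, c))

# Collapse batch and n into one axis, do a single 2D matmul, then restore the batch axis
flat = embed.reshape(-1, m)        # [batch * n, m]
h = (flat @ U).reshape(-1, n, c)   # [batch, n, c]

# Reference result: multiply each [n, m] sample by U individually
reference = np.stack([sample @ U for sample in embed])
print(np.allclose(h, reference))
```

The two results agree because each row of an [n, m] sample is multiplied by U independently, so flattening the batch and n axes together does not mix rows from different samples.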