How does TensorFlow batch_matmul work?
Question
Tensorflow has a function called batch_matmul which multiplies higher dimensional tensors. But I'm having a hard time understanding how it works, perhaps partially because I'm having a hard time visualizing it.
What I want to do is multiply a matrix by each slice of a 3D tensor, but I don't quite understand what the shape of tensor a is. Is z the innermost dimension? Which of the following is correct?
I would most prefer the first to be correct -- it's most intuitive to me and easy to see in the .eval() output. But I suspect the second is correct.
Tensorflow says that batch_matmul performs:
out[..., :, :] = matrix(x[..., :, :]) * matrix(y[..., :, :])
What does that mean? What does that mean in the context of my example? What is being multiplied with what? And why am I not getting a 3D tensor the way I expected?
Recommended answer
You can imagine it as doing a matmul over each training example in the batch.
For example, if you have two tensors with the following dimensions:
a.shape = [100, 2, 5]
b.shape = [100, 5, 2]
and you do a batch tf.matmul(a, b), your output will have the shape [100, 2, 2].
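The shape claim above can be checked with a small sketch. It uses NumPy's `np.matmul` as a stand-in, on the assumption that it follows the same convention as the batched matmul described here: the last two axes are treated as matrices and the leading axis as the batch. The random inputs and variable names are illustrative only.

```python
import numpy as np

# Illustrative inputs with the shapes from the answer.
rng = np.random.default_rng(0)
a = rng.standard_normal((100, 2, 5))
b = rng.standard_normal((100, 5, 2))

# Batched product: the last two axes are multiplied as matrices,
# once per index along the leading (batch) axis.
out = np.matmul(a, b)
print(out.shape)

# The same computation written as an explicit loop over the batch.
looped = np.stack([a[i] @ b[i] for i in range(a.shape[0])])
print(np.allclose(out, looped))
```

In other words, `out[i]` is the ordinary 2x2 matrix product of `a[i]` and `b[i]`; nothing is mixed across batch elements.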
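100 is your batch size; the other two dimensions are the dimensions of your data. Each [2, 5] matrix a[i] is multiplied by the matching [5, 2] matrix b[i], giving one [2, 2] result per batch element.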
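For the case raised in the question, multiplying a single matrix by each slice of a 3D tensor, a rough sketch along the same lines follows. It again uses NumPy rather than TensorFlow, and the matrix `m` and its shape are hypothetical. NumPy broadcasts the missing batch axis automatically; whether `tf.matmul` does the same depends on the TensorFlow version, so treat that as an assumption to verify (tiling `m` along a new batch axis is the explicit alternative).

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((100, 2, 5))  # batch of 100 matrices, each 2x5
m = rng.standard_normal((5, 3))       # hypothetical single 5x3 matrix

# m has no batch axis, so NumPy reuses it for every slice a[i].
out = np.matmul(a, m)
print(out.shape)

# Explicit alternative: copy m along a batch axis so shapes match exactly.
m_tiled = np.tile(m[np.newaxis, :, :], (a.shape[0], 1, 1))
out_tiled = np.matmul(a, m_tiled)
print(np.allclose(out, out_tiled))
```

Either way the batch axis is the outermost one, and each slice `a[i]` is multiplied by the same matrix.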