How does tensorflow batch_matmul work?

Question

Tensorflow has a function called batch_matmul which multiplies higher dimensional tensors. But I'm having a hard time understanding how it works, perhaps partially because I'm having a hard time visualizing it.

What I want to do is multiply a matrix by each slice of a 3D tensor, but I don't quite understand what the shape of tensor a is. Is z the innermost dimension? Which of the following is correct?

I would most prefer the first to be correct -- it's most intuitive to me and easy to see in the .eval() output. But I suspect the second is correct.

Tensorflow says that batch_matmul performs:

out[..., :, :] = matrix(x[..., :, :]) * matrix(y[..., :, :])

What does that mean? What does it mean in the context of my example? What is being multiplied with what? And why am I not getting a 3D tensor the way I expected?

Recommended answer

You can imagine it as doing a matmul over each training example in the batch.

For example, if you have two tensors with the following dimensions:

a.shape = [100, 2, 5]
b.shape = [100, 5, 2]

and you do a batch tf.matmul(a, b), your output will have the shape [100, 2, 2].

100 is your batch size; the other two dimensions are the dimensions of your data.
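To make the "one matmul per example" picture concrete, here is a minimal pure-Python sketch of what a batched matrix multiply computes. It is a reference for the semantics only, not TensorFlow's implementation; the helpers matmul_2d, batch_matmul and shape are hypothetical names introduced for illustration, and the tensors are plain nested lists filled with random numbers.

```python
import random

def matmul_2d(x, y):
    """Multiply an (n x k) matrix by a (k x m) matrix, both nested lists."""
    k, m = len(y), len(y[0])
    assert all(len(row) == k for row in x), "inner dimensions must match"
    return [[sum(row[p] * y[p][j] for p in range(k)) for j in range(m)]
            for row in x]

def batch_matmul(a, b):
    """Reference semantics (hypothetical helper): out[i] = a[i] @ b[i]."""
    assert len(a) == len(b), "batch sizes must match"
    return [matmul_2d(a_i, b_i) for a_i, b_i in zip(a, b)]

def shape(t):
    """Return the shape of a rectangular nested list."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return dims

random.seed(0)
# Same shapes as in the answer: a is [100, 2, 5], b is [100, 5, 2].
a = [[[random.random() for _ in range(5)] for _ in range(2)] for _ in range(100)]
b = [[[random.random() for _ in range(2)] for _ in range(5)] for _ in range(100)]

out = batch_matmul(a, b)
print(shape(out))  # [100, 2, 2]

# Multiplying ONE matrix w (5 x 3) by every slice of a: reuse w for each batch index.
w = [[random.random() for _ in range(3)] for _ in range(5)]
out_w = batch_matmul(a, [w] * len(a))
print(shape(out_w))  # [100, 2, 3]
```

In this sketch the leading dimension is only ever indexed, never summed over, so the output keeps it: each out[i] is the ordinary 2D product of a[i] and b[i]. The last part shows one way to handle the case in the question (a single matrix applied to each slice of a 3D tensor), by pairing the same matrix with every batch entry.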
