How does torch.einsum perform this 4D tensor multiplication?
Question
I have come across code that uses torch.einsum to compute a tensor multiplication. I understand how it works for lower-order tensors, but not for the 4D tensors below:
import torch
a = torch.rand((3, 5, 2, 10))
b = torch.rand((3, 4, 2, 10))
c = torch.einsum('nxhd,nyhd->nhxy', [a,b])
print(c.size())
# output: torch.Size([3, 2, 5, 4])
I need help regarding:
- What is the operation that has been performed here (an explanation of how the matrices were multiplied, transposed, etc.)?
- Is torch.einsum actually beneficial in this scenario?
Recommended answer
(Skip to the TL;DR section if you only want the breakdown of the steps involved in an einsum.)
I'll try to explain how einsum works step by step for this example, but instead of torch.einsum I'll use numpy.einsum (documentation), which does exactly the same thing; I am simply more comfortable with it. The same steps apply to torch as well.
Let's rewrite the above code in NumPy:
import numpy as np
a = np.random.random((3, 5, 2, 10))
b = np.random.random((3, 4, 2, 10))
c = np.einsum('nxhd,nyhd->nhxy', a,b)
c.shape
#(3, 2, 5, 4)
Step-by-step np.einsum
Einsum is composed of 3 steps: multiply, sum and transpose.
Let's look at our dimensions. We have a (3, 5, 2, 10) and a (3, 4, 2, 10) that we need to bring to (3, 2, 5, 4) based on 'nxhd,nyhd->nhxy'.
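Before breaking this into steps, it may help to read the subscripts as an index formula: c[n, h, x, y] is the sum over d of a[n, x, h, d] * b[n, y, h, d]. The following is a minimal sketch of that reading with explicit loops. The names c_loops, c_einsum and loops_match are illustrative only, and the loops are there for clarity, not speed.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 5, 2, 10))  # axes: n, x, h, d
b = rng.random((3, 4, 2, 10))  # axes: n, y, h, d

N, X, H, D = a.shape
Y = b.shape[1]

# c_loops[n, h, x, y] = sum over d of a[n, x, h, d] * b[n, y, h, d]
c_loops = np.zeros((N, H, X, Y))
for n in range(N):
    for h in range(H):
        for x in range(X):
            for y in range(Y):
                for d in range(D):
                    c_loops[n, h, x, y] += a[n, x, h, d] * b[n, y, h, d]

# Compare against the einsum call from the question
c_einsum = np.einsum('nxhd,nyhd->nhxy', a, b)
loops_match = np.allclose(c_loops, c_einsum)
print(c_loops.shape, loops_match)
```

Only d is missing from the output subscripts, so d is the only axis that gets summed; n and h appear in both inputs and in the output, so they act as batch axes.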
1. Multiply
Let's not worry about the order of the n, x, y, h, d axes for now, and only consider whether each axis is kept or removed (reduced). Writing them down as a table shows how the dimensions line up:
## Multiply ##
       n   x   y   h   d
      --------------------
a  ->  3   5       2   10
b  ->  3       4   2   10
c1 ->  3   5   4   2   10
To make the multiplication broadcast across the x and y axes and produce (x, y), we have to add a new axis in the right place on each array and then multiply.
a1 = a[:,:,None,:,:] #(3, 5, 1, 2, 10)
b1 = b[:,None,:,:,:] #(3, 1, 4, 2, 10)
c1 = a1*b1
c1.shape
#(3, 5, 4, 2, 10) #<-- (n, x, y, h, d)
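To see what the broadcast product actually holds, one can spot-check a single entry: c1[n, x, y, h, d] should equal a[n, x, h, d] * b[n, y, h, d]. This is a small sketch under that reading. The index values below are arbitrary picks for illustration, and element_ok and c1_einsum are names introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 5, 2, 10))  # (n, x, h, d)
b = rng.random((3, 4, 2, 10))  # (n, y, h, d)

a1 = a[:, :, None, :, :]  # (3, 5, 1, 2, 10): size-1 axis where y will go
b1 = b[:, None, :, :, :]  # (3, 1, 4, 2, 10): size-1 axis where x will go
c1 = a1 * b1              # broadcasts to (3, 5, 4, 2, 10)

# Arbitrary indices, chosen only for illustration
n, x, y, h, d = 1, 3, 2, 0, 7
element_ok = np.isclose(c1[n, x, y, h, d], a[n, x, h, d] * b[n, y, h, d])

# The same multiply-only step written as an einsum that keeps every axis
c1_einsum = np.einsum('nxhd,nyhd->nxyhd', a, b)
print(c1.shape, element_ok, np.allclose(c1, c1_einsum))
```

Because every subscript appears in the output 'nxyhd', nothing is summed here; the einsum does only the broadcasted multiplication.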
2. Sum / Reduce
Next, we want to reduce the last axis, d, which has size 10. This leaves the dimensions (n,x,y,h).
## Reduce ##
       n   x   y   h   d
      --------------------
c1 ->  3   5   4   2   10
c2 ->  3   5   4   2
This is straightforward. Let's just apply np.sum over axis=-1:
c2 = np.sum(c1, axis=-1)
c2.shape
#(3,5,4,2) #<-- (n, x, y, h)
3. Transpose
The last step is rearranging the axes with a transpose. We can use np.transpose for this. c2.transpose(0,3,1,2) brings the 3rd axis to the position right after the 0th axis and pushes the 1st and 2nd axes back. So (n,x,y,h) becomes (n,h,x,y).
c3 = c2.transpose(0,3,1,2)
c3.shape
#(3,2,5,4) #<-- (n, h, x, y)
4. Final check
Let's do a final check and see whether c3 is the same as the c generated by np.einsum:
np.allclose(c,c3)
#True
TL;DR
Thus, we have implemented 'nxhd,nyhd->nhxy' as:
input -> nxhd, nyhd
multiply -> nxyhd #broadcasting
sum -> nxyh #reduce
transpose -> nhxy
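As for how the matrices were multiplied and transposed: the same result can also be read as a batched matrix product. For every (n, h) pair, a (5, 10) slice of a is multiplied by the transpose of a (4, 10) slice of b. The sketch below expresses that with transpose and the @ operator. The names a_nhxd, b_nhdy and c_matmul are illustrative and not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 5, 2, 10))  # (n, x, h, d)
b = rng.random((3, 4, 2, 10))  # (n, y, h, d)

# Move h next to n so that (n, h) act as batch axes for matmul
a_nhxd = a.transpose(0, 2, 1, 3)  # (3, 2, 5, 10)
b_nhdy = b.transpose(0, 2, 3, 1)  # (3, 2, 10, 4)

# Batched matrix product over the last two axes: (5, 10) @ (10, 4) -> (5, 4)
c_matmul = a_nhxd @ b_nhdy        # (3, 2, 5, 4)

c_einsum = np.einsum('nxhd,nyhd->nhxy', a, b)
print(c_matmul.shape, np.allclose(c_matmul, c_einsum))
```

This formulation never builds the 5D intermediate (n, x, y, h, d) array, which is one reason the three-step breakdown is best taken as a mental model, not a description of how einsum is implemented internally.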
Advantage
The advantage of np.einsum over the multiple separate steps is that you can choose the "path" it takes to do the computation, and perform multiple operations with the same function. This is controlled by the optimize parameter, which optimizes the contraction order of an einsum expression.
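Contraction order only matters once there are three or more operands, so the two-operand expression from the question does not show it. A minimal sketch with a hypothetical three-matrix chain follows. The shapes and the names A, B, C are made up for illustration; np.einsum_path and the optimize argument are the documented NumPy API.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical shapes, chosen so that the contraction order matters
A = rng.random((8, 64))
B = rng.random((64, 64))
C = rng.random((64, 4))

# Ask NumPy for a contraction path and a human-readable report
path, report = np.einsum_path('ij,jk,kl->il', A, B, C, optimize='greedy')
print(path)
print(report)

# Reuse the precomputed path in the actual einsum call
result_default = np.einsum('ij,jk,kl->il', A, B, C)
result_optimized = np.einsum('ij,jk,kl->il', A, B, C, optimize=path)
print(np.allclose(result_default, result_optimized))
```

The report lists the estimated cost of the naive and optimized orders; whether the optimized path is faster in practice depends on the shapes involved.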
A non-exhaustive list of the operations that can be computed by einsum is shown below, along with the corresponding NumPy functions:
- Trace of an array, numpy.trace.
- Return a diagonal, numpy.diag.
- Array axis summations, numpy.sum.
- Transpositions and permutations, numpy.transpose.
- Matrix multiplication and dot product, numpy.matmul, numpy.dot.
- Vector inner and outer products, numpy.inner, numpy.outer.
- Broadcasting, element-wise and scalar multiplication, numpy.multiply.
- Tensor contractions, numpy.tensordot.
- Chained array operations, in efficient calculation order, numpy.einsum_path.
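A few of the items in this list can be checked directly against their dedicated NumPy functions. The sketch below uses small random arrays; M, P, u, v and the checks dictionary are names introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((4, 4))
P = rng.random((4, 3))
u = rng.random(4)
v = rng.random(3)

# Each entry compares an einsum expression with the equivalent NumPy call
checks = {
    'trace':      np.allclose(np.einsum('ii->', M), np.trace(M)),
    'diagonal':   np.allclose(np.einsum('ii->i', M), np.diag(M)),
    'sum axis 0': np.allclose(np.einsum('ij->j', P), P.sum(axis=0)),
    'transpose':  np.allclose(np.einsum('ij->ji', P), P.T),
    'matmul':     np.allclose(np.einsum('ij,jk->ik', M, P), M @ P),
    'inner':      np.allclose(np.einsum('i,i->', u, u), np.inner(u, u)),
    'outer':      np.allclose(np.einsum('i,j->ij', u, v), np.outer(u, v)),
}

for name, ok in checks.items():
    print(f'{name}: {ok}')
```

In each case the rule is the same as in the 4D example: subscripts that are repeated are multiplied together, and subscripts missing from the output are summed over.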
%%timeit
np.einsum('nxhd,nyhd->nhxy', a,b)
#8.03 µs ± 495 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%%timeit
np.sum(a[:,:,None,:,:]*b[:,None,:,:,:], axis=-1).transpose(0,3,1,2)
#13.7 µs ± 1.42 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
This shows that np.einsum performs the operation faster than the individual steps.
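The %%timeit cells above only run inside IPython or Jupyter. Outside a notebook, a rough equivalent with the standard-library timeit module might look like the sketch below. The helper names with_einsum and with_steps and the repeat count are arbitrary choices made here, and the absolute numbers (and possibly the ratio) will vary with hardware, NumPy version and array size.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
a = rng.random((3, 5, 2, 10))
b = rng.random((3, 4, 2, 10))


def with_einsum():
    # Single einsum call, as in the first %%timeit cell
    return np.einsum('nxhd,nyhd->nhxy', a, b)


def with_steps():
    # Multiply (with broadcasting), sum over d, then transpose
    return np.sum(a[:, :, None, :, :] * b[:, None, :, :, :], axis=-1).transpose(0, 3, 1, 2)


number = 1000  # arbitrary repeat count for this sketch
t_einsum = timeit.timeit(with_einsum, number=number) / number
t_steps = timeit.timeit(with_steps, number=number) / number

print(f'einsum: {t_einsum * 1e6:.2f} µs per call')
print(f'steps:  {t_steps * 1e6:.2f} µs per call')
```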