How does torch.einsum perform this 4D tensor multiplication?


Problem description


I have come across a code which uses torch.einsum to compute a tensor multiplication. I am able to understand the workings for lower order tensors, but, not for the 4D tensor as below:

import torch

a = torch.rand((3, 5, 2, 10))
b = torch.rand((3, 4, 2, 10))

c = torch.einsum('nxhd,nyhd->nhxy', [a,b])

print(c.size())

# output: torch.Size([3, 2, 5, 4])


I need help regarding:

  1. What is the operation that has been performed here (explanation for how the matrices were multiplied/transposed etc.)?
  2. Is torch.einsum actually beneficial in this scenario?

Answer

(Skip to the TL;DR section if you just want the breakdown of the steps involved in an einsum.)


I'll try to explain how einsum works step by step for this example but instead of using torch.einsum, I'll be using numpy.einsum (documentation), which does exactly the same but I am just, in general, more comfortable with it. Nonetheless, the same steps happen for torch as well.


Let's rewrite the above code in NumPy -

import numpy as np

a = np.random.random((3, 5, 2, 10))
b = np.random.random((3, 4, 2, 10))
c = np.einsum('nxhd,nyhd->nhxy', a,b)
c.shape

#(3, 2, 5, 4)


Step-by-step np.einsum

Einsum is composed of 3 steps: multiply, sum and transpose.


Let's look at our dimensions. We have a (3, 5, 2, 10) and a (3, 4, 2, 10) that we need to bring to (3, 2, 5, 4) based on 'nxhd,nyhd->nhxy'
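In index notation, the subscripts 'nxhd,nyhd->nhxy' say that for every n, h, x and y we multiply a and b elementwise and sum over the shared d axis (the only label missing from the output): c[n,h,x,y] = sum over d of a[n,x,h,d] * b[n,y,h,d]. A naive reference loop makes this explicit (a slow sketch for illustration only, not how einsum computes it):

# Naive reference implementation of 'nxhd,nyhd->nhxy' (illustration only, very slow)
N, X, H, D = a.shape       # (3, 5, 2, 10)
Y = b.shape[1]             # 4
c_ref = np.zeros((N, H, X, Y))
for n in range(N):
    for h in range(H):
        for x in range(X):
            for y in range(Y):
                # sum of products over the shared d axis, which is dropped from the output
                c_ref[n, h, x, y] = np.sum(a[n, x, h, :] * b[n, y, h, :])

np.allclose(c_ref, c)
# True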


Let's not worry about the order of the n, x, y, h, d axes for now, and just worry about whether we want to keep them or remove (reduce) them. Writing them down as a table, let's see how we can arrange our dimensions -

1. Multiply

        ## Multiply ##
       n   x   y   h   d
      --------------------
a  ->  3   5       2   10
b  ->  3       4   2   10
c1 ->  3   5   4   2   10


To get the broadcasting multiplication between x and y axis to result in (x, y), we will have to add a new axis at the right places and then multiply.

a1 = a[:,:,None,:,:] #(3, 5, 1, 2, 10)
b1 = b[:,None,:,:,:] #(3, 1, 4, 2, 10)

c1 = a1*b1
c1.shape

#(3, 5, 4, 2, 10)  #<-- (n, x, y, h, d)

2. Sum / Reduce

Next, we want to sum-reduce the last axis of size 10. This gives us the dimensions (n, x, y, h).

          ## Reduce ##
        n   x   y   h   d
       --------------------
c1  ->  3   5   4   2   10
c2  ->  3   5   4   2

This is straightforward; we just sum over axis=-1 -

c2 = np.sum(c1, axis=-1)
c2.shape

#(3,5,4,2)  #<-- (n, x, y, h)


3. Transpose

The last step is rearranging the axes using a transpose. We can use np.transpose for this. transpose(0,3,1,2) basically brings the 3rd axis after the 0th axis and pushes the 1st and 2nd axes back. So, (n,x,y,h) becomes (n,h,x,y).

c3 = c2.transpose(0,3,1,2)
c3.shape

#(3,2,5,4)  #<-- (n, h, x, y)


4. Final check

Let's do a final check and see if c3 is the same as the c which was generated from the np.einsum -

np.allclose(c,c3)

#True


TL;DR

So, we have implemented 'nxhd, nyhd -> nhxy' as -

input     -> nxhd, nyhd
multiply  -> nxyhd      #broadcasting
sum       -> nxyh       #reduce
transpose -> nhxy
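
Back in PyTorch, the same three steps can be reproduced with unsqueeze, sum and permute; here is a minimal sketch (not how torch.einsum computes it internally) that verifies the breakdown against torch.einsum:

import torch

a = torch.rand((3, 5, 2, 10))
b = torch.rand((3, 4, 2, 10))

# 1. multiply with broadcasting: (n, x, 1, h, d) * (n, 1, y, h, d) -> (n, x, y, h, d)
prod = a.unsqueeze(2) * b.unsqueeze(1)
# 2. reduce the shared d axis -> (n, x, y, h)
summed = prod.sum(dim=-1)
# 3. rearrange to (n, h, x, y)
manual = summed.permute(0, 3, 1, 2)

print(torch.allclose(manual, torch.einsum('nxhd,nyhd->nhxy', a, b)))
# True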




Advantage

The advantage of np.einsum over the multiple separate steps is that you can choose the "path" it takes to do the computation and perform multiple operations with the same function. This can be done with the optimize parameter, which will optimize the contraction order of the einsum expression.
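
As a quick illustration with the a and b arrays from above, you can pass optimize=True to np.einsum, or inspect the contraction order that would be chosen with np.einsum_path (assuming the default greedy strategy):

# let einsum pick an optimized contraction order
c_opt = np.einsum('nxhd,nyhd->nhxy', a, b, optimize=True)

# inspect the path einsum would take, without computing the result
path, info = np.einsum_path('nxhd,nyhd->nhxy', a, b, optimize='greedy')
print(info)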


A non-exhaustive list of these operations, which can be computed by einsum, is shown below along with examples:

  • Trace of an array, numpy.trace.
  • Return a diagonal, numpy.diag.
  • Array axis summations, numpy.sum.
  • Transpositions and permutations, numpy.transpose.
  • Matrix multiplication and dot product, numpy.matmul numpy.dot.
  • Vector inner and outer products, numpy.inner numpy.outer.
  • Broadcasting, element-wise and scalar multiplication, numpy.multiply.
  • Tensor contractions, numpy.tensordot.
  • Chained array operations, in efficient calculation order, numpy.einsum_path.
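
For example, a few of these equivalences as a small sketch (using arbitrary test arrays m, u and v just to illustrate the subscripts):

m = np.random.random((4, 4))
u = np.random.random(5)
v = np.random.random(5)

np.allclose(np.einsum('ii', m), np.trace(m))             # trace
np.allclose(np.einsum('ii->i', m), np.diag(m))           # main diagonal
np.allclose(np.einsum('ij->ji', m), m.T)                 # transpose
np.allclose(np.einsum('i,j->ij', u, v), np.outer(u, v))  # outer product
np.allclose(np.einsum('i,i->', u, v), np.inner(u, v))    # inner product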
Timing the single np.einsum call against the manual multiply, sum and transpose steps -

%%timeit
np.einsum('nxhd,nyhd->nhxy', a,b)
#8.03 µs ± 495 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%%timeit
np.sum(a[:,:,None,:,:]*b[:,None,:,:,:], axis=-1).transpose(0,3,1,2)
#13.7 µs ± 1.42 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)


This shows that np.einsum performs the operation faster than running the individual steps.

