Tensor multiplication with numpy tensordot


Problem Description


I have a tensor U composed of n matrices of dimension (d,k) and a matrix V of dimension (k,n).

I would like to multiply them so that the result returns a matrix of dimension (d,n) in which column j is the result of the matrix multiplication between the matrix j of U and the column j of V.

One possible way to obtain this is:

# res has shape (d, n); column j is the matrix product U[:, :, j] @ V[:, j]
for j in range(n):
    res[:,j] = U[:,:,j].dot(V[:,j])

I am wondering if there is a faster approach using numpy library. In particular I'm thinking of the np.tensordot() function.

This small snippet allows me to multiply a single matrix by a scalar, but the obvious generalization to a vector is not returning what I was hoping for.

a = np.array(range(1, 17))
a.shape = (4,4)
b = np.array((1,2,3,4,5,6,7))
r1 = np.tensordot(b,a, axes=0)

Any suggestion?

Solution

There are a couple of ways you could do this. The first thing that comes to mind is np.einsum:

# some fake data
gen = np.random.RandomState(0)
ni, nj, nk = 10, 20, 100
U = gen.randn(ni, nj, nk)
V = gen.randn(nj, nk)

res1 = np.zeros((ni, nk))
for k in range(nk):
    res1[:,k] = U[:,:,k].dot(V[:,k])

res2 = np.einsum('ijk,jk->ik', U, V)

print(np.allclose(res1, res2))
# True

np.einsum uses Einstein notation to express tensor contractions. In the expression 'ijk,jk->ik' above, i, j and k are subscripts that correspond to the different dimensions of U and V. Each comma-separated grouping corresponds to one of the operands passed to np.einsum (in this case U has dimensions ijk and V has dimensions jk). The '->ik' part specifies the dimensions of the output array. Any dimension whose subscript is absent from the output string is summed over.
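As a sanity check on the notation, ordinary matrix multiplication can be written the same way: 'ij,jk->ik' sums over the shared subscript j. A minimal sketch (array names here are arbitrary):

```python
import numpy as np

gen = np.random.RandomState(0)
A = gen.randn(3, 4)
B = gen.randn(4, 5)

# 'ij,jk->ik': j appears in both inputs but not the output, so it is summed
# over -- exactly the inner dimension of a matrix product.
C = np.einsum('ij,jk->ik', A, B)

print(np.allclose(C, A.dot(B)))
# True
```

The contraction in the answer, 'ijk,jk->ik', is the same idea with the extra subscript k kept in the output, so no summation happens over k; each slice along k is contracted independently.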

np.einsum is incredibly useful for performing complex tensor contractions, but it can take a while to fully wrap your head around how it works. You should take a look at the examples in the documentation (linked above).


Some other options:

  1. Element-wise multiplication with broadcasting, followed by summation:

    res3 = (U * V[None, ...]).sum(1)
    

  2. inner1d with a load of transposing (note that numpy.core.umath_tests is a private module; it is deprecated and has been removed in newer NumPy releases):

    from numpy.core.umath_tests import inner1d
    
    res4 = inner1d(U.transpose(0, 2, 1), V.T)
    
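On NumPy versions where inner1d is no longer importable, the same per-column product can be expressed with np.matmul by moving the k axis to the front so it acts as a batch dimension. A sketch under that assumption (res5 is an arbitrary name, not part of the original answer):

```python
import numpy as np

gen = np.random.RandomState(0)
ni, nj, nk = 10, 20, 100
U = gen.randn(ni, nj, nk)
V = gen.randn(nj, nk)

# Batch matmul over k:
#   U.transpose(2, 0, 1) has shape (nk, ni, nj)
#   V.T[..., None]       has shape (nk, nj, 1)
# matmul contracts the nj axis per batch -> (nk, ni, 1)
res5 = np.matmul(U.transpose(2, 0, 1), V.T[..., None])[..., 0].T

res1 = np.einsum('ijk,jk->ik', U, V)
print(np.allclose(res1, res5))
# True
```

The trailing [..., 0] drops the singleton matmul axis and the final .T restores the (ni, nk) layout used elsewhere in the answer.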

Some benchmarks:

In [1]: ni, nj, nk = 100, 200, 1000

In [2]: %%timeit U = gen.randn(ni, nj, nk); V = gen.randn(nj, nk)
   ....: np.einsum('ijk,jk->ik', U, V)
   ....: 
10 loops, best of 3: 23.4 ms per loop

In [3]: %%timeit U = gen.randn(ni, nj, nk); V = gen.randn(nj, nk)
   ....: (U * V[None, ...]).sum(1)
   ....: 
10 loops, best of 3: 59.7 ms per loop

In [4]: %%timeit U = gen.randn(ni, nj, nk); V = gen.randn(nj, nk)
   ....: inner1d(U.transpose(0, 2, 1), V.T)
   ....: 
10 loops, best of 3: 45.9 ms per loop
