numpy matrix algebra best practice


Problem description


My question is regarding the last line below: mu@sigma@mu. Why does it work? Is a one-dimensional ndarray treated as a row vector or a column vector? Either way, shouldn't it be mu.T@sigma@mu or mu@sigma@mu.T? I know mu.T still returns mu since mu only has one dimension, but still, the interpreter seems to be too smart.

>> import numpy as np
>> mu = np.array([1, 1])
>> print(mu)

[1 1]

>> sigma = np.eye(2) * 3
>> print(sigma)

[[ 3.  0.]
 [ 0.  3.]]

>> mu@sigma@mu

6.0

More generally, which is the better practice for matrix algebra in Python: use ndarray and @ for matrix multiplication as above (cleaner code), or use np.matrix and the overloaded * as below (mathematically less confusing)?

>> import numpy as np
>> mu = np.matrix(np.array([1, 1]))
>> print(mu)

[[1 1]]

>> sigma = np.matrix(np.eye(2) * 3)
>> print(sigma)

[[ 3.  0.]
 [ 0.  3.]]

>> a = mu * sigma * mu.T
>> a.item((0, 0))

6.0

Solution

Python performs chained operations left to right:

In [32]: mu=np.array([1,1])
In [33]: sigma= np.array([[3,0],[0,3]])
In [34]: mu@sigma@mu
Out[34]: 6

is the same as doing two expressions:

In [35]: temp=mu@sigma
In [36]: temp.shape
Out[36]: (2,)
In [37]: temp@mu
Out[37]: 6

In my comments (deleted) I claimed @ was just doing np.dot. That's not quite right. The documentation describes the handling of 1d arrays differently, but the resulting shapes are the same:

In [38]: mu.dot(sigma).dot(mu)
Out[38]: 6
In [39]: mu.dot(sigma).shape
Out[39]: (2,)

For 1d and 2d arrays, np.dot and @ should produce the same result. They differ in handling higher dimensional arrays.
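A small sketch of that higher-dimensional difference: with 3d inputs, matmul treats the leading axis as a batch of stacked matrices, while dot pairs axes from both operands and keeps the rest:

```python
import numpy as np

a = np.ones((2, 3, 4))
b = np.ones((2, 4, 5))

# @ (matmul) treats the first axis as a "batch" of stacked 2d matrices
print((a @ b).shape)        # (2, 3, 5)

# dot pairs the last axis of a with the second-to-last axis of b,
# keeping all remaining axes of both operands
print(np.dot(a, b).shape)   # (2, 3, 2, 5)
```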

Historically numpy has used arrays, which can be 0d, 1d, and on up. np.dot was the original matrix multiplication method/function.

np.matrix was added largely as a convenience for wayward MATLAB programmers. It only allows 2d arrays (just like 1990s-era MATLAB). And it overloads __mul__ (the * operator) with

def __mul__(self, other):
    if isinstance(other, (N.ndarray, list, tuple)) :
        # This promotes 1-D vectors to row vectors
        return N.dot(self, asmatrix(other))
    if isscalar(other) or not hasattr(other, '__rmul__') :
        return N.dot(self, other)
    return NotImplemented

Mu*sigma and Mu@sigma behave the same (Mu being the np.matrix version of mu), though the call trees differ:

In [48]: Mu@sigma@Mu
...
ValueError: shapes (1,2) and (1,2) not aligned: 2 (dim 1) != 1 (dim 0)

Mu*sigma produces a (1,2) matrix, which cannot matrix-multiply another (1,2) matrix, hence the need for a transpose:

In [49]: Mu@sigma@Mu.T
Out[49]: matrix([[6]])

Note that this is a (1,1) matrix. You have to use item if you want a scalar. (In MATLAB there isn't such a thing as a scalar. Everything has a shape/size.)
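A minimal sketch of that (1,1) result and the item extraction, assuming Mu is the np.matrix version of the question's mu:

```python
import numpy as np

Mu = np.matrix([1, 1])            # np.matrix is always 2d: shape (1, 2)
sigma = np.matrix(np.eye(2) * 3)

out = Mu @ sigma @ Mu.T           # matrix product; the result is still a matrix
print(out.shape)                  # (1, 1)
print(out.item())                 # 6.0 -- item() extracts the Python scalar
```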

@ is a relatively recent addition to Python and numpy. It was added to the language as an operator with no default implementation; numpy (and possibly other packages) implements it.

It makes chained expressions possible, though I don't have any problems with the chained dot in [38]. It is more useful when handling higher dimensional cases.

This addition means there is one less reason to use the old np.matrix class. (Matrix like behavior is more deeply ingrained in the scipy.sparse matrix classes.)

If you want 'mathematical purity' I'd suggest taking the mathematical physics approach, and use Einstein notation - as implemented in np.einsum.
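For instance, the quadratic form from the question can be written with explicit indices, which makes the row/column roles unambiguous (a sketch using the question's mu and sigma):

```python
import numpy as np

mu = np.array([1, 1])
sigma = np.eye(2) * 3

# sum_i sum_j mu_i * sigma_ij * mu_j -- no transposes needed,
# the repeated indices say exactly which axes contract
q = np.einsum('i,ij,j', mu, sigma, mu)
print(q)   # 6.0
```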


With arrays this small, the timings reflect the calling structure more than the actual number of calculations:

In [57]: timeit mu.dot(sigma).dot(mu)
2.79 µs ± 7.75 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [58]: timeit mu@sigma@mu
6.29 µs ± 31.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [59]: timeit Mu@sigma@Mu.T
17.1 µs ± 134 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [60]: timeit Mu*sigma*Mu.T
17.7 µs ± 517 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Note that the 'old-fashioned' dot is fastest, while both matrix versions are slower.

On the transpose

I might add that in an expression like mu@sigma@mu.T, the .T is evaluated first, so the @ (matmul) does not know that there was an attempt to transpose mu. It just uses the result of the transpose. Remember it's the Python interpreter that parses the expression, not the numpy functions. So if mu.T does nothing to mu, as is the case with 1d arrays, the .T will have no effect in the @ either.

transpose is well defined for 2d arrays. The numpy transpose has been written in such a way that it also works with arrays of any dimension. It changes the order of the axes, but never adds a dimension. There are other tools for doing that, such as reshape and the np.newaxis indexing.
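A quick sketch of that distinction: .T on a 1d array is a no-op, while reshape or np.newaxis indexing actually add the dimension:

```python
import numpy as np

v = np.array([1, 1])
print(v.T.shape)               # (2,)  -- transpose never adds a dimension
print(v.reshape(1, -1).shape)  # (1, 2) -- row vector via reshape
print(v[:, np.newaxis].shape)  # (2, 1) -- column vector via newaxis indexing
```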

In Octave, transpose is defined only for 2-d objects

>> ones(2,3,4)'
error: transpose not defined for N-d objects

1-d objects don't exist in MATLAB. In numpy, by contrast, transpose works for any number of dimensions:

In [270]: np.ones((2,3,4),int).T.shape
Out[270]: (4, 3, 2)

Inner and Outer products

np.dot is quite explicit about how it handles 1d arrays, 'vectors':

  • If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).

In matmul the description is more convoluted, but the result is the same.

Regarding your comment:

A = [1,2] and B = [3,5] being (2,) ndarrays, A@B could mean [1,2] * [3,5]' = 13, or it could mean [1,2]' * [3,5] = [[3,5],[6,10]]

The various ways of taking a product in numpy are:

In [273]: A = np.array([1,2]); B = np.array([3,5])

Element multiplication (.* in MATLAB)

In [274]: A*B
Out[274]: array([ 3, 10])

Inner product

In [275]: A@B       # same as np.dot(A,B)
Out[275]: 13

Outer product

In [276]: np.outer(A,B)
Out[276]: 
array([[ 3,  5],
       [ 6, 10]])

Another way to get the inner product:

In [278]: np.sum(A*B)
Out[278]: 13

With Einstein notation (from math physics), both inner and outer:

In [280]: np.einsum('i,i',A,B)
Out[280]: array(13)
In [281]: np.einsum('i,j',A,B)
Out[281]: 
array([[ 3,  5],
       [ 6, 10]])

Using broadcasting to get the outer

In [282]: A[:,None]*B[None,:]
Out[282]: 
array([[ 3,  5],
       [ 6, 10]])

What you intend by [1, 2]' is implemented in numpy as A[:,None], turning the 1d array into a column vector (2d).

Octave added broadcasting long after numpy had it. I don't know if MATLAB has advanced that far or not. :)

@ doesn't do broadcasting, but it does work with column or row vectors:

In [283]: A@B[:,None]       # your imagined A*B'
Out[283]: array([13])

To get the outer product with @ I need to add dimensions, making A a column vector and B a row vector:

In [284]: A[:,None]@B
ValueError: shapes (2,1) and (2,) not aligned: 1 (dim 1) != 2 (dim 0)
In [285]: A[:,None]@B[None,:]
Out[285]: 
array([[ 3,  5],
       [ 6, 10]])

When the arrays really are 2d, and that includes the case where one dimension is 1, the @/dot behavior isn't that different from common matrix conventions. It's when dimension is 1, or greater than 2, that intuitions from MATLAB fail, in part because MATLAB doesn't handle those.
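The 1d case is exactly where that intuition breaks: matmul promotes a 1d operand to a row or column vector depending on which side of @ it sits on, then drops the added dimension again. That is why mu@sigma@mu in the question needs no transposes (a sketch):

```python
import numpy as np

A = np.array([1, 2])
M = np.eye(2)

# On the left, A acts like a (1,2) row vector; the added 1 is then dropped
print((A @ M).shape)   # (2,)

# On the right, A acts like a (2,1) column vector; the 1 is dropped again
print((M @ A).shape)   # (2,)
```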

This Octave error suggests that n-d matrices just piggyback on 2d ones. They aren't real n-d in the numpy sense.

>> ones(2,3,4) * ones(2,3,4)
error: operator *: nonconformant arguments (op1 is 2x12, op2 is 2x12)

