Remove mean from numpy matrix
Problem description
I have a numpy matrix A where the data is organised column-vector-wise, i.e. A[:,0] is the first data vector, A[:,1] is the second, and so on. I wanted to know whether there is a more elegant way to zero out the mean from this data. I am currently doing it via a for loop:
mean = A.mean(axis=1)
for k in range(A.shape[1]):
    A[:, k] = A[:, k] - mean
So does numpy provide a function to do this? Or can it be done more efficiently another way?
Recommended answer
As is typical, you can do this in a number of ways. Each of the approaches below works by adding a dimension to the mean vector, making it a 4 x 1 array, and then NumPy's broadcasting takes care of the rest. Each approach creates a view of mean, rather than a deep copy. The first approach (i.e., using newaxis) is likely preferred by most, but the other methods are included for the record.
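To make the shape change and the view claim concrete, here is a minimal sketch using the same example array as the rest of this answer. The names col and centered are introduced only for illustration, and np.shares_memory is used as one way to check that no copy of the data was made.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
mean = A.mean(axis=1)          # shape (4,): one mean per row

col = mean[:, np.newaxis]      # shape (4, 1): same data, extra axis
centered = A - col             # (4, 3) minus (4, 1) broadcasts to (4, 3)

print(mean.shape, col.shape, centered.shape)
print(np.shares_memory(mean, col))   # whether col is a view of mean
print(centered.mean(axis=1))         # row means after centering
```

Without the extra axis, A - mean would try to broadcast shapes (4, 3) and (4,), which do not line up, so NumPy raises an error instead of subtracting row by row.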
In addition to the approaches below, see also ovgolovin's answer, which uses a NumPy matrix to avoid the need to reshape mean altogether.
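The referenced answer is not reproduced here, so the following is only a sketch of the general idea, assuming it relies on np.matrix keeping its results two-dimensional. The keepdims=True variant for plain arrays is an addition of mine rather than part of either answer, and the variable names are illustrative.

```python
import numpy as np

# np.matrix reductions stay 2-D, so the row means come back as a 4 x 1 matrix.
M = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
row_mean = M.mean(axis=1)
centered_m = M - row_mean            # broadcasts without any reshape

# For a plain ndarray, keepdims=True keeps the reduced axis with length 1.
A = np.asarray(M)
centered_a = A - A.mean(axis=1, keepdims=True)

print(row_mean.shape)
print(np.array_equal(np.asarray(centered_m), centered_a))
```

Note that the NumPy documentation no longer recommends np.matrix for new code, so the keepdims form may be the more practical way to skip the reshape.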
For the methods below, we start with the following code and example array A.
import numpy as np
A = np.array([[1,2,3], [4,5,6], [7,8,9], [10, 11, 12]])
mean = A.mean(axis=1)
Using numpy.newaxis
>>> A - mean[:, np.newaxis]
array([[-1.,  0.,  1.],
       [-1.,  0.,  1.],
       [-1.,  0.,  1.],
       [-1.,  0.,  1.]])
Using None
The documentation states that None can be used instead of newaxis. This is because
>>> np.newaxis is None
True
Therefore, the following accomplishes the task.
>>> A - mean[:, None]
array([[-1.,  0.,  1.],
       [-1.,  0.,  1.],
       [-1.,  0.,  1.],
       [-1.,  0.,  1.]])
That said, newaxis is clearer and should be preferred. Also, a case can be made that newaxis is more future-proof. See also: Numpy: Should I use newaxis or None?
Using ndarray.reshape
>>> A - mean.reshape((mean.shape[0], 1))
array([[-1.,  0.,  1.],
       [-1.,  0.,  1.],
       [-1.,  0.,  1.],
       [-1.,  0.,  1.]])
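As a quick sanity check, the three expression-based approaches shown so far can be compared directly. This is a small sketch rather than part of the original answer; the reshape(-1, 1) shorthand, where -1 asks NumPy to infer that dimension, is included as an extra variant.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
mean = A.mean(axis=1)

results = [
    A - mean[:, np.newaxis],                 # newaxis
    A - mean[:, None],                       # None, the same object as newaxis
    A - mean.reshape((mean.shape[0], 1)),    # explicit reshape
    A - mean.reshape(-1, 1),                 # reshape with an inferred dimension
]

# Every variant should produce the same centered array.
all_equal = all(np.array_equal(results[0], r) for r in results[1:])
print(all_equal)
```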
Changing ndarray.shape directly
You can also change the shape of mean directly.
>>> mean.shape = (mean.shape[0], 1)
>>> A - mean
array([[-1.,  0.,  1.],
       [-1.,  0.,  1.],
       [-1.,  0.,  1.],
       [-1.,  0.,  1.]])