Normalize numpy array columns in Python
Question
I have a numpy array where each cell of a specific row represents a value for a feature. I store all of them in a 100*4 matrix.
A B C
1000 10 0.5
765 5 0.35
800 7 0.09
Any idea how I can normalize the rows of this numpy.array so that each value is between 0 and 1?
My desired output is:
A B C
1 1 1
0.765 0.5 0.7
0.8 0.7 0.18 (which is 0.09/0.5)
Thanks in advance :)
Recommended answer
If I understand correctly, what you want to do is divide by the maximum value in each column. You can do this easily using broadcasting.
Starting with your example array:
import numpy as np
x = np.array([[1000, 10, 0.5],
              [ 765,  5, 0.35],
              [ 800,  7, 0.09]])
x_normed = x / x.max(axis=0)
print(x_normed)
# [[ 1. 1. 1. ]
# [ 0.765 0.5 0.7 ]
# [ 0.8 0.7 0.18 ]]
x.max(0) takes the maximum over the 0th dimension (i.e. over the rows). This gives you a vector of shape (ncols,) containing the maximum value in each column. You can then divide x by this vector to normalize your values so that the maximum value in each column is scaled to 1.
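As a rough sketch of how the broadcasting works here, the snippet below prints the shape of the per-column maximum and also shows a row-wise variant, in case the rows rather than the columns are what should be scaled. The names col_max, row_max and x_row_normed are illustrative and not part of the original answer.

```python
import numpy as np

x = np.array([[1000, 10, 0.5],
              [ 765,  5, 0.35],
              [ 800,  7, 0.09]])

# Column-wise: one maximum per column, shape (ncols,)
col_max = x.max(axis=0)
print(col_max.shape)          # (3,)

# (3, 3) / (3,) broadcasts the vector across every row
x_normed = x / col_max

# Row-wise variant (assumption: only needed if rows should be scaled).
# keepdims=True keeps shape (nrows, 1) so the division broadcasts per row.
row_max = x.max(axis=1, keepdims=True)
x_row_normed = x / row_max
```

Without keepdims=True the row maxima would have shape (nrows,), which would be broadcast along the columns instead of the rows.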
If x contains negative values, you would need to subtract the minimum first:
x_normed = (x - x.min(0)) / np.ptp(x, axis=0)
Here, np.ptp(x, axis=0) returns the "peak-to-peak" (i.e. the range, max - min) along axis 0. (The method form x.ptp(0) is equivalent but was removed in NumPy 2.0.) This normalization also guarantees that the minimum value in each column will be 0.
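One case the formula above does not cover is a column whose values are all equal: its range is 0, so the division yields NaN. The following is a minimal sketch of one way to guard against that. The helper name min_max_scale_columns and the choice to map constant columns to 0 are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def min_max_scale_columns(x):
    """Scale each column of a 2-D array to the range [0, 1]."""
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    col_range = x.max(axis=0) - col_min   # same values as the peak-to-peak range
    # Assumption: constant columns (range 0) are mapped to 0 rather than NaN.
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (x - col_min) / safe_range

data = np.array([[-2.0, 10.0, 5.0],
                 [ 0.0, 20.0, 5.0],
                 [ 2.0, 40.0, 5.0]])
scaled = min_max_scale_columns(data)
print(scaled)
```

In this example the first column contains negative values and the third column is constant, so the result has every column within [0, 1] and the third column entirely 0.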