Normalize numpy array columns in python

Question

I have a numpy array where each cell of a specific row represents a value for a feature. I store all of them in a 100*4 matrix.

A     B   C
1000  10  0.5
765   5   0.35
800   7   0.09  

Any idea how I can normalize rows of this numpy.array so that each value is between 0 and 1?

My desired output is:

A     B    C
1     1    1
0.765 0.5  0.7
0.8   0.7  0.18(which is 0.09/0.5)

Thanks in advance :)

Recommended answer

If I understand correctly, what you want to do is divide by the maximum value in each column. You can do this easily using broadcasting.

Starting with your example array:

import numpy as np

x = np.array([[1000,  10,   0.5],
              [ 765,   5,  0.35],
              [ 800,   7,  0.09]])

# Divide every row by the vector of per-column maxima (broadcasting)
x_normed = x / x.max(axis=0)

print(x_normed)
# [[ 1.     1.     1.   ]
#  [ 0.765  0.5    0.7  ]
#  [ 0.8    0.7    0.18 ]]

x.max(0), which is shorthand for x.max(axis=0), takes the maximum over the 0th dimension (i.e. rows). This gives you a vector of size (ncols,) containing the maximum value in each column. You can then divide x by this vector in order to normalize your values such that the maximum value in each column will be scaled to 1.
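To make the broadcasting step concrete, here is a minimal sketch that prints the shapes involved. It also shows a row-wise variant, since the question mentions rows; that variant is an illustration and not part of the original answer. The variable names col_max, row_max, by_column and by_row are made up for this example.

```python
import numpy as np

x = np.array([[1000, 10, 0.5],
              [765, 5, 0.35],
              [800, 7, 0.09]])

# Column-wise: one maximum per column, shape (3,)
col_max = x.max(axis=0)
print(col_max.shape)

# (3, 3) / (3,) broadcasts the vector across every row
by_column = x / col_max

# Row-wise: keepdims=True keeps shape (3, 1), so it broadcasts across columns
row_max = x.max(axis=1, keepdims=True)
print(row_max.shape)
by_row = x / row_max
```

Without keepdims=True the row maxima would have shape (3,) and would be matched against the columns instead, which is a common source of silently wrong results on square arrays.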

If x contains negative values you would need to subtract the minimum first:

x_normed = (x - x.min(0)) / np.ptp(x, axis=0)

Here, np.ptp(x, axis=0) returns the "peak-to-peak" (i.e. the range, max - min) along axis 0. The function form is used because the x.ptp(0) method was removed in NumPy 2.0; on older versions the two are equivalent. This normalization also guarantees that the minimum value in each column will be 0.
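One case the formula above does not cover is a column whose values are all equal: its range is 0 and the division produces NaN. The following is a rough sketch of one way to handle that, not something the answer itself prescribes. The helper name minmax_columns and the choice to map constant columns to 0 are assumptions made for illustration.

```python
import numpy as np

def minmax_columns(x):
    """Scale each column of a 2-D array to [0, 1] (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    col_range = np.ptp(x, axis=0)  # max - min per column
    # Assumption: a constant column (range 0) maps to all zeros instead of NaN
    safe_range = np.where(col_range == 0, 1.0, col_range)
    return (x - col_min) / safe_range

data = np.array([[-2.0, 5.0, 1.0],
                 [ 0.0, 5.0, 3.0],
                 [ 2.0, 5.0, 2.0]])
scaled = minmax_columns(data)
print(scaled)
```

Here the first column contains negative values and is scaled to 0, 0.5, 1, while the constant middle column becomes all zeros.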
