Min-max normalization of individual columns in a 2D matrix
Question
I have a dataset which has 4 columns/attributes and 150 rows. I want to normalize this data using min-max normalization. So far, my code is:
minData = min(min(data1));
maxData = max(max(data1));
% divide by the range (max - min) so the result spans [0, 1]
minmaxeddata = (data1 - minData) ./ (maxData - minData);
Here, minData and maxData hold the global minimum and maximum values. Therefore, this code actually applies a min-max normalization over all values in the 2D matrix, so that the global minimum is 0 and the global maximum is 1.
However, I would like to perform the same operation on each column individually. Specifically, each column of the 2D matrix should be min-max normalized independently from the other columns.
I tried using just min(data1) and max(data1), but got an error saying that the matrix dimensions must agree.
However, by using the global minimum and maximum, I got values in the range [0, 1] and have run experiments using this normalized dataset. Is there any problem with my results? Is there a problem with my understanding as well? Any guidance would be appreciated.
Recommended answer
If I understand you correctly, you wish to normalize each column of data1. Because each column is an independent data set and most likely has a different dynamic range, a global min-max operation is probably not what you want. I would recommend going with your initial idea of normalizing each column individually.
As for your error, you can't subtract min(data1) from data1 directly, because min(data1) produces a row vector while data1 is a matrix. You are subtracting a vector from a matrix, which is why you are getting that error.
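To see the mismatch concretely, it can help to compare the sizes of the two operands. This is a minimal sketch that assumes a random 150-by-4 matrix as a stand-in for data1 (the real data set is not shown in the question); colMin is a name introduced here only for illustration.

```matlab
% Hypothetical stand-in for data1: 150 rows, 4 columns.
data1 = rand(150, 4);

% For a matrix input, min works down each column and returns one value per column.
colMin = min(data1);

disp(size(data1))    % matrix: 150 rows, 4 columns
disp(size(colMin))   % row vector: 1 row, 4 columns
```

The two operands share the number of columns but not the number of rows, so the row vector has to be repeated down the rows before an element-wise subtraction makes sense.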
If you want to achieve what you're asking, use bsxfun to broadcast the vector so that it is repeated for as many rows as data1 has. Therefore:
mindata = min(data1);
maxdata = max(data1);
minmaxdata = bsxfun(@rdivide, bsxfun(@minus, data1, mindata), maxdata - mindata);
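As a possible alternative, MATLAB R2016b and later expand compatible dimensions implicitly in arithmetic operators, so the same computation can be written without bsxfun. The sketch below assumes such a release and uses a small made-up matrix; minmaxdata2 is a name introduced here only to compare the two forms.

```matlab
% Small hypothetical matrix, used only for illustration.
data1 = [1 10; 2 20; 3 40];

mindata = min(data1);   % row vector of column minima
maxdata = max(data1);   % row vector of column maxima

% bsxfun form, as in the answer.
minmaxdata = bsxfun(@rdivide, bsxfun(@minus, data1, mindata), maxdata - mindata);

% Implicit-expansion form (assumes MATLAB R2016b or later).
minmaxdata2 = (data1 - mindata) ./ (maxdata - mindata);

% Both forms perform the same element-wise operations.
disp(isequal(minmaxdata, minmaxdata2))
```

In either form, a column whose values are all identical has a range of zero, so the division produces NaN for that column; such columns would need separate handling.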
Example
>> data1 = [5 9 9 9 3 3; 3 10 2 1 10 1; 2 4 4 6 5 5]
data1 =
5 9 9 9 3 3
3 10 2 1 10 1
2 4 4 6 5 5
When I run the above normalization code, I get:
minmaxdata =
1.0000 0.8333 1.0000 1.0000 0 0.5000
0.3333 1.0000 0 0 1.0000 0
0 0 0.2857 0.6250 0.2857 1.0000
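One quick way to check a result like this is to confirm that every column now has a minimum of 0 and a maximum of 1. A minimal sketch, reusing the example matrix and the normalization code from the answer:

```matlab
data1 = [5 9 9 9 3 3; 3 10 2 1 10 1; 2 4 4 6 5 5];

mindata = min(data1);
maxdata = max(data1);
minmaxdata = bsxfun(@rdivide, bsxfun(@minus, data1, mindata), maxdata - mindata);

% After per-column normalization, each column should span exactly [0, 1].
disp(min(minmaxdata))   % column minima of the normalized data
disp(max(minmaxdata))   % column maxima of the normalized data
```

With global normalization, by contrast, only the columns that contain the global minimum or maximum would reach 0 or 1.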