Normalizing feature values for SVM

Problem description

I've been playing with some SVM implementations and I am wondering - what is the best way to normalize feature values to fit into one range? (from 0 to 1)

Let's suppose I have 3 features with values in ranges of:

  1. 3 - 5
  2. 0.02 - 0.05
  3. 10 - 15

How do I convert all of those values into the range [0, 1]?

What if, during training, the highest value of feature 1 that I encounter is 5, and then, once I start using my model on much bigger datasets, I stumble upon values as high as 7? In the converted range, that would exceed 1...

How do I normalize values during training to account for the possibility that "values in the wild" exceed the highest (or lowest) values the model has "seen" during training? How will the model react to that, and how do I make it work properly when that happens?

Recommended answer

You normalise a vector by converting it to a unit vector. This trains the SVM on the relative values of the features, not the magnitudes. The normalisation algorithm will work on vectors with any values.

To convert to a unit vector, divide each value by the length of the vector. For example, a vector of [4 0.02 12] has a length of 12.6491. The normalised vector is then [4/12.6491 0.02/12.6491 12/12.6491] = [0.316 0.0016 0.949].

If we encounter a vector of [400 2 1200] "in the wild", it will normalise to the same unit vector as above, because it is the same vector scaled by 100. The magnitudes of the features are "cancelled out" by the normalisation, and (assuming the feature values are non-negative) we are left with relative values between 0 and 1.
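As a rough illustration of the steps described above, here is a minimal Python sketch of unit-vector (L2) normalisation. The function name `to_unit_vector` and the decision to return an all-zero vector unchanged are assumptions made for this example and are not part of the original answer.

```python
import math

def to_unit_vector(values):
    """Scale a feature vector so its Euclidean (L2) length is 1."""
    length = math.sqrt(sum(v * v for v in values))
    if length == 0.0:
        # Assumption: an all-zero vector has no direction, so return it unchanged.
        return list(values)
    return [v / length for v in values]

train_vec = to_unit_vector([4, 0.02, 12])
wild_vec = to_unit_vector([400, 2, 1200])

print([round(v, 4) for v in train_vec])  # [0.3162, 0.0016, 0.9487]

# Both inputs point in the same direction, so they normalise to the same
# result (up to floating-point rounding).
same = all(math.isclose(a, b, rel_tol=1e-9) for a, b in zip(train_vec, wild_vec))
print(same)  # True
```

Note that this normalises each sample on its own, so no minimum or maximum from the training set has to be stored or reused at prediction time.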
