Which normalization method, min-max or z-scaling (zero mean, unit variance), works best for deep learning?

Problem description

I have data representing relative counts (0.0-1.0), as shown in the example below. Each value is calculated with the formula

cell value (e.g. 23) / sum of the column (e.g. 1200) ≈ 0.0192
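The formula above can be sketched in a few lines of NumPy. The raw counts here are hypothetical, chosen only so that the first column sums to 1200 as in the example; they are not the asker's data.

```python
import numpy as np

# Hypothetical raw counts: rows are samples, columns are features.
# The first column sums to 1200 to match the example in the question.
counts = np.array([
    [23.0, 10.0, 5.0],
    [400.0, 30.0, 15.0],
    [777.0, 60.0, 30.0],
])

# Divide every cell by the sum of its own column.
column_sums = counts.sum(axis=0)
relative = counts / column_sums

print(relative[0, 0])  # 23 / 1200
```

After this step every column sums to 1, so all values fall between 0.0 and 1.0.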

Example data

 f1       f2         f3        f5        f6      f7      f8     class  
0.266    0.133     0.200     0.133    0.066    0.133    0.066     1 
0.250    0.130     0.080     0.160    0.002    0.300    0.111     0 
0.000    0.830     0.180     0.016    0.002    0.059    0.080     1
0.300    0.430     0.078     0.100    0.082    0.150    0.170     0

Before applying a deep learning algorithm, I remove features that show high correlation.

I am unsure which normalization approach is correct before building the model:

  1. Use the data directly, since it is already scaled (0.0-1.0).
  2. Apply min-max scaling ( https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html )
  3. Apply z-scaling ( https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html )
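As a rough illustration, the three options can be compared on the four example rows. This is a minimal NumPy sketch rather than the asker's pipeline. With default settings, scikit-learn's MinMaxScaler and StandardScaler compute the same per-column quantities. Note that although all values lie in 0.0-1.0, individual columns may cover only a small part of that range (f6 spans 0.002 to 0.082 here), so options 1 and 2 do not give the same result.

```python
import numpy as np

# Feature columns from the example data (class labels left out).
X = np.array([
    [0.266, 0.133, 0.200, 0.133, 0.066, 0.133, 0.066],
    [0.250, 0.130, 0.080, 0.160, 0.002, 0.300, 0.111],
    [0.000, 0.830, 0.180, 0.016, 0.002, 0.059, 0.080],
    [0.300, 0.430, 0.078, 0.100, 0.082, 0.150, 0.170],
])

# Option 1: use the relative counts as they are.
X_raw = X

# Option 2: min-max scaling, each column stretched onto [0, 1].
col_min = X.min(axis=0)
col_max = X.max(axis=0)
X_minmax = (X - col_min) / (col_max - col_min)

# Option 3: z-scaling, each column shifted to mean 0 and scaled to
# unit variance (population standard deviation, ddof=0).
X_z = (X - X.mean(axis=0)) / X.std(axis=0)

# Per-column ranges before and after min-max scaling.
print(X_raw.max(axis=0) - X_raw.min(axis=0))
print(X_minmax.max(axis=0) - X_minmax.min(axis=0))
```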

I ask because min-max and z-scaling both improve performance when I use classical supervised algorithms, but with deep learning using "TensorFlow-GPU" I cannot see any significant difference between the two.

Thanks.

Recommended answer

Z-scaling is a good idea when your data is approximately normally distributed, which is often the case.

Min-max scaling is the right choice when you expect a roughly uniform distribution.

In short, it depends on your data and your neural network.

Both are sensitive to outliers, though, so you could also try median-MAD scaling (center on the median, divide by the median absolute deviation).
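Median-MAD scaling can be sketched as follows. The function name `median_mad_scale` and the `eps` guard against a zero MAD are assumptions made for this illustration, not part of any library, and the sample values are made up. Some definitions multiply the MAD by a constant (about 1.4826) to make it comparable to a standard deviation; that factor is left out here.

```python
import numpy as np

def median_mad_scale(X, eps=1e-12):
    """Center each column on its median and divide by its MAD.

    eps is a hypothetical guard so that a column with MAD == 0
    does not cause a division by zero.
    """
    X = np.asarray(X, dtype=float)
    median = np.median(X, axis=0)
    # Median absolute deviation of each column.
    mad = np.median(np.abs(X - median), axis=0)
    return (X - median) / np.maximum(mad, eps)

# One made-up column with a single large outlier (100.0).
X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])
scaled = median_mad_scale(X)
print(scaled.ravel())
```

For this column the median is 3 and the MAD is 1, and neither is moved by the outlier. The four regular values land at -2, -1, 0 and 1, while the outlier stays clearly separated at 97.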

See also: https://stats.stackexchange.com/questions/7757/data-normalization-and-standardization-in-neural-networks
