keras BatchNormalization axis clarification
Question
The keras BatchNormalization layer uses axis=-1 as a default value and states that the feature axis is typically normalized. Why is this the case?
I suppose this is surprising because I'm more familiar with using something like StandardScaler, which would be equivalent to using axis=0. This would normalize the features individually.
Is there a reason why samples are individually normalized by default (i.e. axis=-1) in keras, as opposed to features?
Concrete example
It's common to transform data such that each feature has zero mean and unit variance. Let's just consider the "zero mean" part with this mock dataset, where each row is a sample:
>>> data = np.array([[ 1, 10, 100, 1000],
...                  [ 2, 20, 200, 2000],
...                  [ 3, 30, 300, 3000]])
>>> data.mean(axis=0)
array([ 2., 20., 200., 2000.])
>>> data.mean(axis=1)
array([ 277.75, 555.5 , 833.25])
Wouldn't it make more sense to subtract the axis=0 mean, as opposed to the axis=1 mean? Using axis=1, the units and scales can be completely different.
The first equation of section 3 in this paper seems to imply that axis=0 should be used for calculating expectations and variances for each feature individually, assuming you have an (m, n) shaped dataset where m is the number of samples and n is the number of features.
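That per-feature normalization can be written out directly. A minimal sketch for an (m, n) batch, with the epsilon value chosen here purely for illustration:

```python
import numpy as np

def batch_norm_2d(x, eps=1e-3):
    """Normalize each feature (column) of an (m, n) batch:
    subtract the per-feature mean, divide by the per-feature std."""
    mu = x.mean(axis=0)   # shape (n,): one mean per feature
    var = x.var(axis=0)   # shape (n,): one variance per feature
    return (x - mu) / np.sqrt(var + eps)

# Toy batch: 64 samples, 4 features, deliberately off-center and scaled.
x = np.random.default_rng(0).normal(5.0, 3.0, size=(64, 4))
x_hat = batch_norm_2d(x)
print(x_hat.mean(axis=0))  # close to 0 for every feature
print(x_hat.std(axis=0))   # close to 1 for every feature
```

This sketch omits the learned scale and shift (gamma, beta) that the full batch-norm layer applies after normalizing.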
Another example
I wanted to see the dimensions of the means and variances BatchNormalization was calculating on a toy dataset:
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from keras.optimizers import Adam
from keras.models import Model
from keras.layers import BatchNormalization, Dense, Input
iris = load_iris()
X = iris.data
y = pd.get_dummies(iris.target).values
input_ = Input(shape=(4, ))
norm = BatchNormalization()(input_)
l1 = Dense(4, activation='relu')(norm)
output = Dense(3, activation='sigmoid')(l1)
model = Model(input_, output)
model.compile(Adam(0.01), 'categorical_crossentropy')
model.fit(X, y, epochs=100, batch_size=32)
bn = model.layers[1]
bn.moving_mean # <tf.Variable 'batch_normalization_1/moving_mean:0' shape=(4,) dtype=float32_ref>
The input X has shape (150, 4), and the BatchNormalization layer calculated 4 means, which means it operated over axis=0.
If BatchNormalization has a default of axis=-1, then shouldn't there be 150 means?
Answer
The confusion is due to the meaning of axis in np.mean versus in BatchNormalization.
When we take the mean along an axis, we collapse that dimension and preserve all other dimensions. In your example, data.mean(axis=0) collapses the 0-axis, which is the vertical dimension of data.
When we compute a BatchNormalization along an axis, we preserve the dimensions of the array, and we normalize with respect to the mean and standard deviation over every other axis. So in your 2D example, BatchNormalization with axis=-1 subtracts the mean over axis=0, just as you expect. This is why bn.moving_mean has shape (4,).