Are these models equivalent?


Problem description

Main question: I define the same model in two different ways. Why do I get different results? They seem to be the same model.

Secondary question (answered below): If I run the code again, I get different results again. I have set the seed at the beginning to fix the randomness. Why is that happening?

import numpy as np
np.random.seed(1)
from keras.models import Model, Sequential
from keras.layers import Input, Dense

# Model 1: Sequential API
model1 = Sequential([
    Dense(20, activation='sigmoid', kernel_initializer='glorot_normal',
          input_shape=(2,)),
    Dense(2,  activation='linear', kernel_initializer='glorot_normal'),
])

model1.compile(optimizer='adam', loss='mean_squared_error')

# Model 2: functional API, same architecture
ipt    = Input(shape=(2,))
x      = Dense(20, activation='sigmoid', kernel_initializer='glorot_normal')(ipt)
out    = Dense(2,  activation='linear',  kernel_initializer='glorot_normal')(x)
model2 = Model(ipt, out)

model2.compile(optimizer='adam', loss='mean_squared_error')

x_train = np.array([[1, 2], [3, 4], [3, 4]])

model1.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)
model2.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)

The first time, the output is:

Train on 2 samples, validate on 1 samples
Epoch 1/2

2/2 [==============================] - 0s 68ms/step - loss: 14.4394 - val_loss: 21.5747
Epoch 2/2

2/2 [==============================] - 0s 502us/step - loss: 14.3199 - val_loss: 21.4163
Train on 2 samples, validate on 1 samples
Epoch 1/2

2/2 [==============================] - 0s 72ms/step - loss: 11.0523 - val_loss: 17.7059
Epoch 2/2

2/2 [==============================] - 0s 491us/step - loss: 10.9833 - val_loss: 17.5785

The second time, the output is:

Train on 2 samples, validate on 1 samples
Epoch 1/2

2/2 [==============================] - 0s 80ms/step - loss: 14.4394 - val_loss: 21.5747
Epoch 2/2

2/2 [==============================] - 0s 501us/step - loss: 14.3199 - val_loss: 21.4163
Train on 2 samples, validate on 1 samples
Epoch 1/2

2/2 [==============================] - 0s 72ms/step - loss: 11.0523 - val_loss: 17.6733
Epoch 2/2

2/2 [==============================] - 0s 485us/step - loss: 10.9597 - val_loss: 17.5459


Update after reading the answer: Per the answer below, one of my questions has been answered, so I changed the beginning of my code to:

import numpy as np
np.random.seed(1)
import random
random.seed(2)
import tensorflow as tf
tf.set_random_seed(3)

And now I get the same numbers as before, so it is stable. But my main question remains unanswered: why do the two equivalent models give different results each time?

Here is the result I get every time:

Result 1:

Epoch 1/2

2/2 [==============================] - 0s 66ms/sample - loss: 11.9794 - val_loss: 18.9925
Epoch 2/2

2/2 [==============================] - 0s 268us/sample - loss: 11.8813 - val_loss: 18.8572

Result 2:

Epoch 1/2

2/2 [==============================] - 0s 67ms/sample - loss: 5.4743 - val_loss: 9.3471
Epoch 2/2

2/2 [==============================] - 0s 3ms/sample - loss: 5.4108 - val_loss: 9.2497

Recommended answer

The problem is rooted in the expected vs. actual behavior of model definition and randomness. To see what's going on, we must understand how an RNG works:

  • A "random number generator" (RNG) is actually a function that produces numbers such that they map onto a probability distribution "in the long run"
  • When the RNG function, e.g. RNG(), is called, it returns a "random" value and increments its internal counter by 1. Call this counter n - then: random_value = RNG(n)
  • When you set a SEED, you set n according to the value of that seed (but not to that seed itself); we can represent this difference via + c in the counter
  • c is a constant produced by a non-linear but deterministic function of the seed: f(seed)
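
For example, NumPy's global RNG replays the exact same sequence after each reseed:
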
import numpy as np

np.random.seed(4)         # internal counter = 0 + c
print(np.random.random()) # internal counter = 1 + c
print(np.random.random()) # internal counter = 2 + c
print(np.random.random()) # internal counter = 3 + c

np.random.seed(4)         # internal counter = 0 + c
print(np.random.random()) # internal counter = 1 + c
print(np.random.random()) # internal counter = 2 + c
print(np.random.random()) # internal counter = 3 + c

0.9670298390136767
0.5472322491757223
0.9726843599648843

0.9670298390136767
0.5472322491757223
0.9726843599648843

Suppose model1 has 100 weights, and you set a seed (n = 0 + c). After model1 is built, the counter is at 100 + c. If you don't reset the seed, then even if you build model2 with the exact same code, the models will differ, as model2's weights are initialized per n running from 100 + c to 200 + c.
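
As a minimal sketch of the fix (assuming the same standalone Keras / TF1 setup as in the question, where the default initializers draw their seeds from NumPy's RNG; build_model is a hypothetical helper wrapping either definition above), resetting the seed before each build should yield identical initial weights:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

def build_model():  # hypothetical helper: same architecture as the question
    return Sequential([
        Dense(20, activation='sigmoid', kernel_initializer='glorot_normal',
              input_shape=(2,)),
        Dense(2, activation='linear', kernel_initializer='glorot_normal'),
    ])

np.random.seed(1)           # counter at 0 + c
model1 = build_model()      # drawing weights advances the counter

np.random.seed(1)           # reset: counter back to 0 + c
model2 = build_model()      # same draws -> same initial weights

for w1, w2 in zip(model1.get_weights(), model2.get_weights()):
    assert np.array_equal(w1, w2)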

Additional info:

There are three seeds to set to better control the randomness:

import numpy as np
np.random.seed(1)         # for Numpy ops
import random 
random.seed(2)            # for Python ops
import tensorflow as tf
tf.set_random_seed(3)     # for tensorflow ops - e.g. Dropout masks

This will give pretty good reproducibility, but not perfect if you're using a GPU, due to parallelism of operations; this video explains it well. For even better reproducibility, set your PYTHONHASHSEED - that and other info can be found in the official Keras FAQ.
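
A hedged sketch of the fuller recipe (adapted from the TF1-era Keras FAQ; note that PYTHONHASHSEED only affects hashing if it is exported before the Python process starts, e.g. PYTHONHASHSEED=0 python your_script.py):

import os
os.environ['PYTHONHASHSEED'] = '0'  # only effective if set before launch

import numpy as np
import random
import tensorflow as tf

np.random.seed(1)
random.seed(2)
tf.set_random_seed(3)

# Single-threaded ops curb the nondeterminism from op-level parallelism (TF1 API)
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)

from keras import backend as K
K.set_session(sess)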

完美"的可重复性是多余的,因为大多数情况下您的结果应该在 0.1% 内一致 - 但如果您真的需要它,目前唯一的方法可能是切换到 CPU 并停止使用 CUDA - 但那'会大大减慢训练速度(x10+).

"Perfect" reproducibility is rather redundant, as your results should agree within .1% majority of the time - but if you really need it, likely the only way currently is to switch to CPU and stop using CUDA - but that'll slow down training tremendously (by x10+).

Sources of randomness:

  • Weight initializations (every default Keras initializer uses randomness)
  • Noise layers (Dropout, GaussianNoise, etc.)
  • Hashing, for hash-based operations such as item order in a set or dict (see the snippet after this list)
  • GPU parallelism (see linked video)
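
To illustrate the hashing point (a minimal sketch; since Python 3.3, str hashes are randomized per interpreter run unless PYTHONHASHSEED is fixed, which in turn perturbs set iteration order):

# Run this in two separate interpreter sessions: the printed hash typically
# differs between runs unless PYTHONHASHSEED is set in the environment.
print(hash("keras"))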

Demonstration of model randomness:

import numpy as np
np.random.seed(4)

model1_init_weights = [np.random.random(), np.random.random(), np.random.random()]
model2_init_weights = [np.random.random(), np.random.random(), np.random.random()]
print("model1_init_weights:", model1_init_weights)
print("model2_init_weights:", model2_init_weights)

model1_init_weights: [0.9670298390136767, 0.5472322491757223, 0.9726843599648843]
model2_init_weights: [0.7148159936743647, 0.6977288245972708, 0.21608949558037638]

Restart the kernel. Now run:

import numpy as np
np.random.seed(4)

model2_init_weights = [np.random.random(), np.random.random(), np.random.random()]
model1_init_weights = [np.random.random(), np.random.random(), np.random.random()]
print("model1_init_weights:", model1_init_weights)
print("model2_init_weights:", model2_init_weights)

model1_init_weights: [0.7148159936743647, 0.6977288245972708, 0.21608949558037638]
model2_init_weights: [0.9670298390136767, 0.5472322491757223, 0.9726843599648843]

Thus, flipping the order of model1 and model2 in your code also flips the losses. This is because the seed does not reset itself between the two models' definitions, so your weight initializations end up totally different.

If you wish them to be the same, reset the seed before defining EACH MODEL, and before FITTING each model - and use a handy function like the one below. But your best bet is to restart the kernel and work in separate .py files.

def reset_seeds():
    np.random.seed(1)
    random.seed(2)
    tf.set_random_seed(3)
    print("RANDOM SEEDS RESET")
