Are these models equivalent?

Problem description

Main question: I define the same model in two different ways. Why do I get different results? They seem to be the same model.

Secondary question (answered below) If I run the code again, I get different results again. I have set the seed at the beginning to fix the randomness. Why is that happening?

import numpy as np
np.random.seed(1)
from keras.models import Model, Sequential
from keras.layers import Input, Dense

model1 = Sequential([
    Dense(20, activation='sigmoid', kernel_initializer='glorot_normal',
          input_shape=(2,)),
    Dense(2,  activation='linear',  kernel_initializer='glorot_normal'),
])

model1.compile(optimizer='adam', loss='mean_squared_error')

ipt    = Input(shape=(2,))
x      = Dense(20, activation='sigmoid', kernel_initializer='glorot_normal')(ipt)
out    = Dense(2,  activation='linear',  kernel_initializer='glorot_normal')(x)
model2 = Model(ipt, out)

model2.compile(optimizer='adam', loss='mean_squared_error')

x_train = np.array([[1, 2], [3, 4], [3, 4]])

model1.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)
model2.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)

The first time, the output is:

Train on 2 samples, validate on 1 samples
Epoch 1/2

2/2 [==============================] - 0s 68ms/step - loss: 14.4394 - val_loss: 21.5747
Epoch 2/2

2/2 [==============================] - 0s 502us/step - loss: 14.3199 - val_loss: 21.4163
Train on 2 samples, validate on 1 samples
Epoch 1/2

2/2 [==============================] - 0s 72ms/step - loss: 11.0523 - val_loss: 17.7059
Epoch 2/2

2/2 [==============================] - 0s 491us/step - loss: 10.9833 - val_loss: 17.5785

The second time, the output is:

Train on 2 samples, validate on 1 samples
Epoch 1/2

2/2 [==============================] - 0s 80ms/step - loss: 14.4394 - val_loss: 21.5747
Epoch 2/2

2/2 [==============================] - 0s 501us/step - loss: 14.3199 - val_loss: 21.4163
Train on 2 samples, validate on 1 samples
Epoch 1/2

2/2 [==============================] - 0s 72ms/step - loss: 11.0523 - val_loss: 17.6733
Epoch 2/2

2/2 [==============================] - 0s 485us/step - loss: 10.9597 - val_loss: 17.5459


Update after reading the answer: the answer below resolves my secondary question. I changed the beginning of my code to:

import numpy as np
np.random.seed(1)
import random
random.seed(2)
import tensorflow as tf
tf.set_random_seed(3)

And now I get the same numbers on every run, so it is stable. But my main question remains unanswered: why do the two seemingly equivalent models give different results each time?

Here is the result I get every time:

Result 1:

Epoch 1/2

2/2 [==============================] - 0s 66ms/sample - loss: 11.9794 - val_loss: 18.9925
Epoch 2/2

2/2 [==============================] - 0s 268us/sample - loss: 11.8813 - val_loss: 18.8572

Result 2:

Epoch 1/2

2/2 [==============================] - 0s 67ms/sample - loss: 5.4743 - val_loss: 9.3471
Epoch 2/2

2/2 [==============================] - 0s 3ms/sample - loss: 5.4108 - val_loss: 9.2497

Answer

The problem's rooted in the expected vs. actual behavior of model definition and randomness. To see what's going on, we must understand how "RNG" works:

  • A "random number generator" (RNG) is actually a function that produces numbers such that they map onto a probability distribution 'in the long run'.
  • When the RNG function is called, e.g. RNG(), it returns a "random" value and increments its internal counter by 1. Call this counter n - then: random_value = RNG(n).
  • When you set a SEED, you set n according to the value of that seed (but not to the seed itself); we can represent this offset via + c in the counter.
  • c will be a constant produced by a non-linear, but deterministic, function of the seed: f(seed).

import numpy as np

np.random.seed(4)         # internal counter = 0 + c
print(np.random.random()) # internal counter = 1 + c
print(np.random.random()) # internal counter = 2 + c
print(np.random.random()) # internal counter = 3 + c

np.random.seed(4)         # internal counter = 0 + c
print(np.random.random()) # internal counter = 1 + c
print(np.random.random()) # internal counter = 2 + c
print(np.random.random()) # internal counter = 3 + c

0.9670298390136767
0.5472322491757223
0.9726843599648843

0.9670298390136767
0.5472322491757223
0.9726843599648843

Suppose model1 has 100 weights, and you set a seed (n = 0 + c). After model1 is built, your counter is at 100 + c. If you don't reset the seed, even if you build model2 with the exact same code, the models will differ - as model2's weights are initialized per n from 100 + c to 200 + c.
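To make this concrete, here is a minimal sketch in which np.random.random(100) stands in (hypothetically) for drawing a model's 100 weights:

import numpy as np

np.random.seed(1)                        # internal counter: n = 0 + c
model1_weights = np.random.random(100)   # counter now at 100 + c
model2_weights = np.random.random(100)   # counter now at 200 + c - different draws

np.random.seed(1)                        # reset: n = 0 + c again
model1_again = np.random.random(100)     # replays the exact same 100 values

print(np.allclose(model1_weights, model1_again))    # True
print(np.allclose(model1_weights, model2_weights))  # False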


Additional info:

There are three seeds to set to control the randomness:

import numpy as np
np.random.seed(1)         # for Numpy ops
import random 
random.seed(2)            # for Python ops
import tensorflow as tf
tf.set_random_seed(3)     # for tensorflow ops - e.g. Dropout masks

This'll give pretty good reproducibility, but not perfect if you're using a GPU, due to parallelism of operations; this video explains it well. For even better reproducibility, set your PYTHONHASHSEED - that and other info is in the official Keras FAQ.
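For reference, the TF 1.x-era recipe from that FAQ looks roughly like the sketch below; note that PYTHONHASHSEED only reliably takes effect when exported in the shell before the interpreter starts, and the single-threaded session trades speed for determinism:

import os
os.environ['PYTHONHASHSEED'] = '0'  # better: export this before launching Python

import numpy as np
import random
import tensorflow as tf
from keras import backend as K

np.random.seed(1)
random.seed(2)
tf.set_random_seed(3)

# Force single-threaded execution to avoid nondeterministic op ordering
session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)
K.set_session(tf.Session(graph=tf.get_default_graph(), config=session_conf))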

"Perfect" reproducibility is rather redundant, as your results should agree within .1% majority of the time - but if you really need it, likely the only way currently is to switch to CPU and stop using CUDA - but that'll slow down training tremendously (by x10+).

Sources of randomness:

  • Weight initializations (every default Keras initializer uses randomness)
  • Noise layers (Dropout, GaussianNoise, etc)
  • Hashing for hash-based operations, e.g. item order in a set or dict (see the sketch after this list)
  • GPU parallelism (see linked video)
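To see the hashing point in isolation, here is a small sketch (the names are made up); run it twice as a script and the printed order can differ between processes unless the hash seed is pinned:

# Without PYTHONHASHSEED fixed, Python 3 salts string hashes per process,
# so this set's iteration order can change from run to run:
names = {'conv', 'dense', 'dropout', 'batchnorm'}
print(list(names))

# Pinning the hash seed (before the interpreter starts) makes it stable:
#   PYTHONHASHSEED=0 python demo.py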

Demo of model randomness:

import numpy as np
np.random.seed(4)

model1_init_weights = [np.random.random(), np.random.random(), np.random.random()]
model2_init_weights = [np.random.random(), np.random.random(), np.random.random()]
print("model1_init_weights:", model1_init_weights)
print("model2_init_weights:", model2_init_weights)

model1_init_weights: [0.9670298390136767, 0.5472322491757223, 0.9726843599648843]
model2_init_weights: [0.7148159936743647, 0.6977288245972708, 0.21608949558037638]

Restart kernel. Now run this:

import numpy as np
np.random.seed(4)

model2_init_weights = [np.random.random(), np.random.random(), np.random.random()]
model1_init_weights = [np.random.random(), np.random.random(), np.random.random()]
print("model1_init_weights:", model1_init_weights)
print("model2_init_weights:", model2_init_weights)

model1_init_weights: [0.7148159936743647, 0.6977288245972708, 0.21608949558037638]
model2_init_weights: [0.9670298390136767, 0.5472322491757223, 0.9726843599648843]

Thus, flipping the order of model1 and model2 in your code also flips the losses. This is because the seed does not reset itself between the two models' definitions, so your weight initializations are totally different.

If you wish them to be the same, reset the seed before defining EACH MODEL, and before FITTING each model - and use a handy function like below. But your best bet is to restart the kernel and work in separate .py files.

def reset_seeds():
    np.random.seed(1)
    random.seed(2)
    tf.set_random_seed(3)
    print("RANDOM SEEDS RESET")
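A sketch of how the question's script could use it (assuming the same imports and x_train as above):

reset_seeds()   # identical RNG state before each model definition
model1 = Sequential([
    Dense(20, activation='sigmoid', kernel_initializer='glorot_normal',
          input_shape=(2,)),
    Dense(2,  activation='linear',  kernel_initializer='glorot_normal'),
])
model1.compile(optimizer='adam', loss='mean_squared_error')

reset_seeds()
ipt    = Input(shape=(2,))
x      = Dense(20, activation='sigmoid', kernel_initializer='glorot_normal')(ipt)
out    = Dense(2,  activation='linear',  kernel_initializer='glorot_normal')(x)
model2 = Model(ipt, out)
model2.compile(optimizer='adam', loss='mean_squared_error')

reset_seeds()   # ...and again before each fit, so both see the same randomness
model1.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)
reset_seeds()
model2.fit(x_train, x_train, epochs=2, validation_split=0.1, shuffle=False)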
