How to get reproducible results in keras

Question

I get different results (test accuracy) every time I run the imdb_lstm.py example from the Keras framework (https://github.com/fchollet/keras/blob/master/examples/imdb_lstm.py). The code contains np.random.seed(1337) at the top, before any keras imports. That should prevent it from generating different numbers on every run. What am I missing?

UPDATE: How to reproduce:

  1. Install Keras (http://keras.io/)
  2. Execute https://github.com/fchollet/keras/blob/master/examples/imdb_lstm.py a few times. It will train the model and output test accuracy.
    Expected result: Test accuracy is the same on every run.
    Actual result: Test accuracy is different on every run.
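
The reproduction steps above can also be automated. The following is only a minimal sketch: `collect_outputs` and `is_reproducible` are hypothetical helper names, and the command being run is a small stand-in script rather than imdb_lstm.py itself.

```python
import subprocess
import sys


def collect_outputs(command, runs=3):
    """Run `command` several times and return the stdout of each run."""
    outputs = []
    for _ in range(runs):
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        outputs.append(result.stdout.strip())
    return outputs


def is_reproducible(outputs):
    """True when every run printed exactly the same text."""
    return len(set(outputs)) <= 1


# Stand-in command (an assumption for illustration): a tiny seeded script.
# To test the real example, use [sys.executable, "imdb_lstm.py"] instead.
demo_command = [
    sys.executable,
    "-c",
    "import random; random.seed(1337); print(random.random())",
]
demo_outputs = collect_outputs(demo_command)
print(is_reproducible(demo_outputs))
```

Comparing the full printed output is stricter than comparing only the final test accuracy, so any difference between runs shows up.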

UPDATE2: I'm running it on Windows 8.1 with MinGW/msys, module versions:
theano 0.7.0
numpy 1.8.1
scipy 0.14.0c1

UPDATE3: I narrowed the problem down a bit. If I run the example on the GPU (with the theano flag device=gpu0), I get a different test accuracy every time, but if I run it on the CPU, everything works as expected. (My graphics card: NVIDIA GeForce GT 635.)

Recommended answer

You can find the answer in the Keras docs: https://keras.io/getting-started/faq/#how-can-i-obtain-reproducible-results-using-keras-during-development.

In short, to be absolutely sure that your python script gives reproducible results on a single computer's CPU, you have to do the following:

  1. Set the PYTHONHASHSEED environment variable to a fixed value
  2. Set the python built-in pseudo-random generator to a fixed value
  3. Set the numpy pseudo-random generator to a fixed value
  4. Set the tensorflow pseudo-random generator to a fixed value
  5. Configure a new global tensorflow session

Following the Keras link at the top, the source code I am using is the following:

# Seed value
# Apparently you may use different seed values at each stage
seed_value = 0

# 1. Set the `PYTHONHASHSEED` environment variable to a fixed value
# (Python reads this variable at interpreter startup, so assigning it here
# only affects child processes; export it before launching python otherwise)
import os
os.environ['PYTHONHASHSEED'] = str(seed_value)

# 2. Set the `python` built-in pseudo-random generator to a fixed value
import random
random.seed(seed_value)

# 3. Set the `numpy` pseudo-random generator to a fixed value
import numpy as np
np.random.seed(seed_value)

# 4. Set the `tensorflow` pseudo-random generator to a fixed value
import tensorflow as tf
tf.random.set_seed(seed_value)
# on TensorFlow 1.x, use instead:
# tf.compat.v1.set_random_seed(seed_value)

# 5. Configure a new global `tensorflow` session (single-threaded)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)
# on TensorFlow 1.x with standalone Keras, use instead:
# from keras import backend as K
# session_conf = tf.ConfigProto(intra_op_parallelism_threads=1, inter_op_parallelism_threads=1)
# sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
# K.set_session(sess)
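
One detail of step 1 is easy to miss: Python reads PYTHONHASHSEED when the interpreter starts, so the variable has to be in the environment before the process is launched. The sketch below illustrates this with the standard library only. The helper name `string_hash_in_child` is hypothetical, and the child process simply prints the hash of a string.

```python
import os
import subprocess
import sys


def string_hash_in_child(seed_value):
    """Start a fresh interpreter with PYTHONHASHSEED set and return hash('keras')."""
    # Copy the current environment and add the fixed hash seed for the child.
    env = dict(os.environ, PYTHONHASHSEED=str(seed_value))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('keras'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# Two separate interpreter runs with the same seed agree on the string hash.
hash_a = string_hash_in_child(0)
hash_b = string_hash_in_child(0)
print(hash_a == hash_b)
```

In practice this usually means launching the script as `PYTHONHASHSEED=0 python script.py` (on a Unix-like shell) rather than relying only on the assignment inside the script.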

Needless to say, you do not have to specify any seed or random_state in the numpy, scikit-learn or tensorflow/keras functions that you use in your python script, precisely because the source code above sets their pseudo-random generators globally to a fixed value.
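
As a rough illustration of that point, the sketch below reseeds the global generators and then draws numbers without passing any seed argument. The helper names `seed_everything` and `draw_sample` are hypothetical. numpy is treated as optional here so the snippet also runs where it is not installed, and tensorflow is left out to keep it small.

```python
import os
import random


def seed_everything(seed_value):
    """Seed the global stdlib generator and, if it is installed, numpy's."""
    os.environ["PYTHONHASHSEED"] = str(seed_value)
    random.seed(seed_value)
    try:
        import numpy as np
    except ImportError:
        np = None
    if np is not None:
        np.random.seed(seed_value)
    return np


def draw_sample(np_module, n=5):
    """Draw numbers without passing any explicit seed or random_state."""
    values = [random.random() for _ in range(n)]
    if np_module is not None:
        values.extend(np_module.random.rand(n).tolist())
    return values


np_module = seed_everything(0)
first = draw_sample(np_module)

np_module = seed_everything(0)
second = draw_sample(np_module)

print(first == second)
```

Both draws use only the global generators, so reseeding them with the same value yields the same sequence.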
