If Keras results are not reproducible, what's the best practice for comparing models and choosing hyper parameters?


Problem Description

Update: This question applies to Tensorflow 1.x. I upgraded to 2.0 and (at least on the simple code below) the reproducibility problem appears to be solved on 2.0. So that resolves my problem; but I'm still curious about what "best practices" were used for this issue on 1.x.
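Since the behavior differs between major versions, it's worth confirming which one is actually installed; a one-liner for that (not part of the original question):

import tensorflow as tf
print(tf.__version__)  #1.x shows the non-reproducible behavior described below; 2.0 appears to fix it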

Training the exact same model/parameters/data on keras/tensorflow does not give reproducible results and the loss is significantly different each time you train the model. There are many stackoverflow questions about that (e.g., How to get reproducible results in keras), but the recommended workarounds don't seem to work for me or many other people on StackOverflow. OK, it is what it is.

But given that limitation of non-reproducibility with keras on tensorflow -- what's the best practice for comparing models and choosing hyper parameters? I'm testing different architectures and activations, but since the loss estimate is different each time, I'm never sure if one model is better than the other. Is there any best practice for dealing with this?

I don't think the issue has anything to do with my code, but just in case it helps, here's a sample program:

import os
#stackoverflow says turning off the GPU helps reproducibility, but it doesn't help for me
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = ""
os.environ['PYTHONHASHSEED']=str(1)

import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers 
import random
import pandas as pd
import numpy as np

#StackOverflow says this is needed for reproducibility but it doesn't help for me
from tensorflow.keras import backend as K
config = tf.ConfigProto(intra_op_parallelism_threads=1,inter_op_parallelism_threads=1)
sess = tf.Session(graph=tf.get_default_graph(), config=config)
K.set_session(sess)

#make some random data
NUM_ROWS = 1000
NUM_FEATURES = 10
random_data = np.random.normal(size=(NUM_ROWS, NUM_FEATURES))
df = pd.DataFrame(data=random_data, columns=['x_' + str(ii) for ii in range(NUM_FEATURES)])
y = df.sum(axis=1) + np.random.normal(size=(NUM_ROWS))

def run(x, y):
    #StackOverflow says you have to set the seeds but it doesn't help for me
    tf.set_random_seed(1)
    np.random.seed(1)
    random.seed(1)
    os.environ['PYTHONHASHSEED']=str(1)

    model = keras.Sequential([
            keras.layers.Dense(40, input_dim=df.shape[1], activation='relu'),
            keras.layers.Dense(20, activation='relu'),
            keras.layers.Dense(10, activation='relu'),
            keras.layers.Dense(1, activation='linear')
        ])
    NUM_EPOCHS = 500
    model.compile(optimizer='adam', loss='mean_squared_error')
    model.fit(x, y, epochs=NUM_EPOCHS, verbose=0)
    predictions = model.predict(x).flatten()
    loss = model.evaluate(x, y) #This prints out the loss by side-effect

#Each time we run it gives a wildly different loss. :-(
run(df, y)
run(df, y)
run(df, y)

Given the non-reproducibility, how can I evaluate whether changes in my hyper-parameters and architecture are helping or not?
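One way to make the comparison at least quantifiable on 1.x is to treat each training run as a noisy sample: train the same configuration several times and compare the mean and spread of the final losses rather than a single value. A minimal sketch along those lines (the repeated_loss helper and the run count are made up for illustration; it reuses df and y from the sample program above):

import numpy as np
import tensorflow.keras as keras

def repeated_loss(x, y, n_runs=5, num_epochs=500):
    #Train the same architecture n_runs times and collect the final losses
    losses = []
    for _ in range(n_runs):
        model = keras.Sequential([
                keras.layers.Dense(40, input_dim=x.shape[1], activation='relu'),
                keras.layers.Dense(20, activation='relu'),
                keras.layers.Dense(10, activation='relu'),
                keras.layers.Dense(1, activation='linear')
            ])
        model.compile(optimizer='adam', loss='mean_squared_error')
        model.fit(x, y, epochs=num_epochs, verbose=0)
        losses.append(model.evaluate(x, y, verbose=0))
    return np.array(losses)

losses = repeated_loss(df, y)
#Compare configurations by mean +/- std instead of one noisy number
print('loss: %.4f +/- %.4f' % (losses.mean(), losses.std()))

This doesn't fix the underlying non-determinism, but it does tell you whether a difference between two models is larger than the run-to-run noise.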

Answer

The problem appears to be solved in Tensorflow 2.0 (at least on simple models)! Here is a code snippet that seems to yield repeatable results.

import os
####*IMPORTANT*: Have to do this line *before* importing tensorflow
os.environ['PYTHONHASHSEED']=str(1)

import tensorflow as tf
import tensorflow.keras as keras
import tensorflow.keras.layers 
import random
import pandas as pd
import numpy as np

def reset_random_seeds():
    os.environ['PYTHONHASHSEED']=str(1)
    tf.random.set_seed(1)
    np.random.seed(1)
    random.seed(1)

#make some random data
reset_random_seeds()
NUM_ROWS = 1000
NUM_FEATURES = 10
random_data = np.random.normal(size=(NUM_ROWS, NUM_FEATURES))
df = pd.DataFrame(data=random_data, columns=['x_' + str(ii) for ii in range(NUM_FEATURES)])
y = df.sum(axis=1) + np.random.normal(size=(NUM_ROWS))

def run(x, y):
    reset_random_seeds()

    model = keras.Sequential([
            keras.layers.Dense(40, input_dim=df.shape[1], activation='relu'),
            keras.layers.Dense(20, activation='relu'),
            keras.layers.Dense(10, activation='relu'),
            keras.layers.Dense(1, activation='linear')
        ])
    NUM_EPOCHS = 500
    model.compile(optimizer='adam', loss='mean_squared_error')
    model.fit(x, y, epochs=NUM_EPOCHS, verbose=0)
    predictions = model.predict(x).flatten()
    loss = model.evaluate(x, y) #This prints out the loss by side-effect

#With Tensorflow 2.0 this is now reproducible! 
run(df, y)
run(df, y)
run(df, y)
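On newer TensorFlow 2.x releases there are also convenience APIs that go beyond manual seeding; a minimal sketch, assuming a sufficiently recent version (roughly 2.7+ for set_random_seed and 2.8+ for enable_op_determinism):

import tensorflow as tf

#Seeds Python's random module, NumPy, and TensorFlow in one call
tf.keras.utils.set_random_seed(1)
#Makes TF ops run deterministically, including GPU kernels (experimental API)
tf.config.experimental.enable_op_determinism()

Note that op determinism can slow training down; it trades speed for run-to-run repeatability.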
