Tensorflow: Different results with the same random seed


Problem description

I'm running a reinforcement learning program in a gym environment (BipedalWalker-v2) implemented in TensorFlow. I've manually set the random seeds of the environment, TensorFlow, and NumPy as follows:

import os
import random

import numpy as np
import tensorflow as tf
import gym

os.environ['PYTHONHASHSEED'] = str(42)
random.seed(42)
np.random.seed(42)
tf.set_random_seed(42)

env = gym.make('BipedalWalker-v2')
env.seed(0)

# Restrict TensorFlow to a single thread so op scheduling is deterministic
config = tf.ConfigProto(intra_op_parallelism_threads=1,
                        inter_op_parallelism_threads=1)
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)
# run the graph with sess

However, I get different results every time I run my program, without changing any code. Why are the results inconsistent, and what should I do if I want to obtain the same result?

The only places I can think of that may introduce randomness (other than the neural networks) are:

  1. I use tf.truncated_normal to generate random noise epsilon, so as to implement a noisy layer
  2. I use np.random.uniform to randomly select samples from the replay buffer
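The second randomness source above can be sketched in NumPy. This is a hypothetical illustration (the function name and buffer layout are assumptions, not the asker's code); it shows that once `np.random.seed` is set, uniform draws for replay sampling repeat exactly:

```python
import numpy as np

# Hypothetical sketch of replay-buffer sampling via np.random.uniform,
# as in source 2 above. Re-seeding restores the identical index sequence.
def sample_indices(buffer_size, batch_size):
    # Uniform floats in [0, buffer_size), truncated to integer indices
    return np.random.uniform(0, buffer_size, size=batch_size).astype(int)

np.random.seed(42)
first = sample_indices(buffer_size=1000, batch_size=4)

np.random.seed(42)  # same seed -> same draws
second = sample_indices(buffer_size=1000, batch_size=4)

assert (first == second).all()
```

So if this were the only source of randomness, runs would be identical; divergence after a seeded start points at nondeterminism elsewhere in the pipeline.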

I also noticed that the scores I get are fairly consistent for the first 10 episodes but then begin to diverge. Other quantities, such as losses, show a similar trend but differ numerically.

I've also set "PYTHONHASHSEED" and used a single-threaded CPU as @jaypops96 described, but still cannot reproduce the results. The code in the block above has been updated accordingly.

Answer

I suggest checking whether your TensorFlow graph contains nondeterministic operations. For example, reduce_sum before TensorFlow 1.2 was one such operation. These operations are nondeterministic because floating-point addition and multiplication are non-associative (the order in which floating-point numbers are added or multiplied affects the result) and because such operations don't guarantee their inputs are added or multiplied in the same order every time. See also this question.
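The non-associativity the answer refers to can be demonstrated in pure Python: summing the same three numbers with different grouping gives different results, so any op whose reduction order varies between runs can vary in its output.

```python
# Floating-point addition is not associative: grouping changes the result.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c    # 0.6000000000000001
right = a + (b + c)   # 0.6

print(left == right)  # False
```

A parallel reduce_sum may effectively regroup its inputs differently on each run, which is exactly this effect at scale.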

EDIT (Sep. 20, 2020): The GitHub repository framework-determinism has more information about sources of nondeterminism in machine learning frameworks, particularly TensorFlow.
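For readers on newer TensorFlow versions: the work in that repository led to built-in determinism switches. The sketch below assumes TF 2.1+ for the environment variable and TF 2.8+ for the API call; neither applies to the TF 1.x code in the question:

```python
import os

# Assumption: TensorFlow >= 2.1. This flag requests deterministic
# (typically slower) GPU kernel implementations where available.
os.environ['TF_DETERMINISTIC_OPS'] = '1'

# Assumption: TensorFlow >= 2.8 exposes this as a supported API instead:
#   tf.config.experimental.enable_op_determinism()
```

The environment variable must be set before TensorFlow initializes its kernels, so place it at the top of the script, before `import tensorflow`.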

