DropoutWrapper being non-deterministic across runs?


Problem description

At the beginning of my code (outside the scope of a Session), I've set my random seeds:

import numpy as np
import tensorflow as tf  # TF 1.x API, as used throughout this question

np.random.seed(1)
tf.set_random_seed(1)

This is what my dropout definition looks like:

cell = tf.nn.rnn_cell.DropoutWrapper(cell, output_keep_prob=args.keep_prob, seed=1)
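
For context, this wrapper is typically applied to an RNN cell that is then unrolled with dynamic_rnn, which is where the behaviour discussed in the answer arises. A minimal sketch of that setup (num_units, inputs, and the choice of BasicLSTMCell are hypothetical placeholders, not taken from the question):

import tensorflow as tf  # TF 1.x API

num_units = 128  # hypothetical cell size
base_cell = tf.nn.rnn_cell.BasicLSTMCell(num_units)
cell = tf.nn.rnn_cell.DropoutWrapper(base_cell, output_keep_prob=0.8, seed=1)

# inputs: [batch, time, features]; dynamic_rnn unrolls the wrapped cell
inputs = tf.placeholder(tf.float32, [None, None, 64])
outputs, state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)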

In my first experiment, I kept keep_prob=1. All results obtained were deterministic. I'm running this on a multicore CPU.

In my second experiment, I set keep_prob=0.8 and ran the same code twice. Each run executed these statements:

sess.run(model.cost, feed)
sess.run(model.cost, feed)

Results for the first run:

(Pdb) sess.run(model.cost, feed)
4.9555049
(Pdb) sess.run(model.cost, feed)
4.9548969

Expected behaviour, since DropoutWrapper uses random_uniform.
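
That expectation can be checked in isolation: in TF 1.x a random op such as random_uniform keeps its own internal state, so with both seeds fixed, consecutive sess.run calls draw successive values from one reproducible sequence rather than repeating the first value. A minimal sketch (the shape [2] is arbitrary):

import tensorflow as tf  # TF 1.x API

tf.set_random_seed(1)
r = tf.random_uniform([2], seed=1)

with tf.Session() as sess:
    print(sess.run(r))  # first draw
    print(sess.run(r))  # second draw: different values from the same op
# Re-running the whole script reproduces the same two draws, because
# both the graph-level and op-level seeds are fixed.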

Results for the second run:

(Pdb) sess.run(model.cost, feed)
4.9551616
(Pdb) sess.run(model.cost, feed)
4.9552417

Why is this sequence not identical to the first run's output, despite setting both an operation-level and a graph-level seed?

Answer

The answer was already provided in the comments, but no one has written it out explicitly yet, so here it is:

dynamic_rnn internally uses tf.while_loop, which can actually evaluate multiple iterations in parallel (see the documentation on parallel_iterations). In practice, if everything inside the loop body or loop condition depends on the previous iteration's values, nothing can run in parallel, but there can be computations that do not depend on those values, and these will be evaluated in parallel. In your case, inside the DropoutWrapper, you have at some point something like this:

random_ops.random_uniform(noise_shape, ...)

This operation is independent of the previous values of the loop, so it can be computed in parallel for all time steps. With such parallel execution, it is non-deterministic which time step gets which dropout mask.
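
Given that, a common way to restore determinism (a hedged sketch, not part of the original answer) is to disable the parallelism: tf.nn.dynamic_rnn forwards a parallel_iterations argument (default 32) to the underlying tf.while_loop, and setting it to 1 forces the per-step dropout masks to be drawn in a fixed order, at some cost in speed. Assuming cell and inputs as in the question's setup:

# Evaluate one time step at a time so the random_uniform calls inside
# DropoutWrapper run in a fixed, reproducible order.
outputs, state = tf.nn.dynamic_rnn(
    cell, inputs,
    parallel_iterations=1,  # default is 32
    dtype=tf.float32)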
