Resetting TensorFlow streaming metrics' variables
Question
I have a bunch of streaming metrics (tf.metrics.accuracy and custom streaming micro, macro and weighted F1-scores).
During training, I get the kind of plot below (never mind the overfitting).
This happens because, to compute the validation set's metrics, I call tf.local_variables_initializer to reset the metrics so that the value covers only the validation set.
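This implies 2 side effects:
1. the spikes in the plot
2. between two validations, even though validation runs only every second time, the training metrics keep aggregating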
I could partially solve the situation by having different tensors hold each metric (train vs. val), but it would not solve 2.
Therefore, I have 2 questions:
- In your experience, is this the behavior you would expect to see? If not, is there a solution?
- Is there a way to have metrics stream only over the last n batches?
Recommended answer
This behaviour is expected if you reset the metrics in between training. The train metrics don't aggregate the validation metrics if they are two different ops. I will give an example of how to keep those metrics separate and how to reset only one of them.
Toy example:
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.metrics)

logits = tf.placeholder(tf.int64, [2, 3])
labels = tf.Variable([[0, 1, 0], [1, 0, 1]])

# Create two different ops; the name scopes prefix each metric's local variable names.
with tf.name_scope('train'):
    train_acc, train_acc_op = tf.metrics.accuracy(labels=tf.argmax(labels, 1),
                                                  predictions=tf.argmax(logits, 1))
with tf.name_scope('valid'):
    valid_acc, valid_acc_op = tf.metrics.accuracy(labels=tf.argmax(labels, 1),
                                                  predictions=tf.argmax(logits, 1))
Training:
sess = tf.Session()
# Initialize the local variables, since they hold the state used for the metrics calculation.
sess.run(tf.local_variables_initializer())
# The global initializer is needed for the `labels` variable.
sess.run(tf.global_variables_initializer())
# Initial state
print(sess.run(train_acc, {logits: [[0, 1, 0], [1, 0, 1]]}))
print(sess.run(valid_acc, {logits: [[0, 1, 0], [1, 0, 1]]}))
# 0.0
# 0.0
The initial state is 0.0.
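To make the bookkeeping behind these numbers concrete, here is a minimal plain-Python sketch of what a streaming accuracy accumulates. It assumes the metric keeps two running counters, a total of correct predictions and a count of predictions seen, and reports their ratio (0.0 while nothing has been seen). The class name `StreamingAccuracy` and its methods are hypothetical and only model the behaviour; they are not part of TensorFlow.

```python
class StreamingAccuracy:
    """Hypothetical stand-in for the two accumulators behind a streaming accuracy."""

    def __init__(self):
        self.total = 0.0  # correct predictions seen so far
        self.count = 0.0  # predictions seen so far

    def update(self, labels, predictions):
        # Plays the role of the update op: add this batch to the running counters.
        self.total += sum(1.0 for y, p in zip(labels, predictions) if y == p)
        self.count += len(labels)
        return self.value()

    def value(self):
        # Plays the role of the value tensor: report 0.0 instead of dividing by zero.
        return self.total / self.count if self.count > 0 else 0.0

    def reset(self):
        # Plays the role of re-running the initializer for these two counters.
        self.total = 0.0
        self.count = 0.0


train, valid = StreamingAccuracy(), StreamingAccuracy()
print(train.value())                 # 0.0
train.update([1, 0], [1, 0])         # both predictions correct
print(train.value(), valid.value())  # 1.0 0.0
```

Because each instance owns its counters, updating or resetting one leaves the other untouched, which mirrors what the separate train and valid metrics do in the rest of the example.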
Now calling the training metric op:
# Training loop
for _ in range(10):
    sess.run(train_acc_op, {logits: [[0, 1, 0], [1, 0, 1]]})
print(sess.run(train_acc, {logits: [[0, 1, 0], [1, 0, 1]]}))
# 1.0
print(sess.run(valid_acc, {logits: [[0, 1, 0], [1, 0, 1]]}))
# 0.0
Only the training accuracy got updated, while the validation accuracy is still 0.0. Calling the validation ops:
for _ in range(10):
    sess.run(valid_acc_op, {logits: [[0, 1, 0], [0, 1, 0]]})
print(sess.run(valid_acc, {logits: [[0, 1, 0], [1, 0, 1]]}))
# 0.5
print(sess.run(train_acc, {logits: [[0, 1, 0], [1, 0, 1]]}))
# 1.0
Here the validation accuracy got updated to a new value, while the training accuracy remained unchanged.
Let's reset only the validation ops:
stream_vars_valid = [v for v in tf.local_variables() if 'valid/' in v.name]
sess.run(tf.variables_initializer(stream_vars_valid))
print(sess.run(valid_acc, {logits:[[0,1,0],[1,0,1]]}))
#0.0
print(sess.run(train_acc, {logits:[[0,1,0],[1,0,1]]}))
#1.0
The validation accuracy got reset to zero, while the training accuracy remained unchanged.
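The question's second point, streaming a metric over only the last n batches, is not covered by the example above. One possible approach, sketched here as an assumption rather than as a tf.metrics feature, is to do the bookkeeping in plain Python: fetch the labels and predictions for each batch, store per-batch counts in a fixed-size queue, and let the oldest batch fall out. The `WindowedAccuracy` class is a hypothetical helper, and the inputs are assumed to be lists of class indices.

```python
from collections import deque


class WindowedAccuracy:
    """Hypothetical helper: accuracy over the most recent `n` batches only."""

    def __init__(self, n):
        # Each entry is (correct, seen) for one batch; maxlen drops the oldest entry.
        self.batches = deque(maxlen=n)

    def update(self, labels, predictions):
        correct = sum(1 for y, p in zip(labels, predictions) if y == p)
        self.batches.append((correct, len(labels)))
        return self.value()

    def value(self):
        seen = sum(s for _, s in self.batches)
        correct = sum(c for c, _ in self.batches)
        return correct / seen if seen else 0.0


acc = WindowedAccuracy(n=2)
acc.update([1, 0], [0, 1])  # batch 1: 0 of 2 correct
acc.update([1, 0], [1, 0])  # batch 2: 2 of 2 correct
acc.update([1, 0], [1, 0])  # batch 3: 2 of 2 correct, batch 1 is dropped
print(acc.value())  # 1.0
```

Storing counts per batch rather than a running total is what makes it possible to forget old batches; a single total/count pair cannot be "un-added" without knowing what each batch contributed.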