Loss function works with reduce_mean but not reduce_sum


Question


I'm new to TensorFlow, and have been looking at the examples here. I wanted to rewrite the multilayer perceptron classification model as a regression model. However, I encountered some strange behaviour when modifying the loss function. It works fine with tf.reduce_mean, but if I try using tf.reduce_sum it gives NaNs in the output. This seems very strange, as the functions are very similar - the only difference is that the mean divides the sum by the number of elements. So how could this change introduce NaNs?

import numpy as np
import tensorflow as tf

# Parameters
learning_rate = 0.001

# Network Parameters
n_hidden_1 = 32 # 1st layer number of features
n_hidden_2 = 32 # 2nd layer number of features
n_input = 2 # number of inputs
n_output = 1 # number of outputs

# Make artificial data
SAMPLES = 1000
X = np.random.rand(SAMPLES, n_input)
T = np.c_[X[:,0]**2 + np.sin(X[:,1])]

# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_output])

# Create model
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with tanh activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.tanh(layer_1)
    # Hidden layer with tanh activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.tanh(layer_2)
    # Output layer with linear activation
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_output]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_output]))
}

pred = multilayer_perceptron(x, weights, biases)

# Define loss and optimizer
#se = tf.reduce_sum(tf.square(pred - y))   # Why does this give nans?
mse = tf.reduce_mean(tf.square(pred - y))  # When this doesn't?
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(mse)

# Initializing the variables
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

training_epochs = 10
display_step = 1

# Training cycle
for epoch in range(training_epochs):
    avg_cost = 0.
    # Loop over all batches
    for i in range(100):
        # Run optimization op (backprop) and cost op (to get loss value)
        _, msev = sess.run([optimizer, mse], feed_dict={x: X, y: T})
    # Display logs per epoch step
    if epoch % display_step == 0:
        print("Epoch:", '%04d' % (epoch+1), "mse=", \
            "{:.9f}".format(msev))


The problematic variable se is commented out. It should be used in place of mse.


With mse the output looks like this:

Epoch: 0001 mse= 0.051669389
Epoch: 0002 mse= 0.031438075
Epoch: 0003 mse= 0.026629323
...

With se it ends up like this:

Epoch: 0001 se= nan
Epoch: 0002 se= nan
Epoch: 0003 se= nan
...

Answer


The loss summed across the batch is 1000 times larger (from skimming the code, I think your training batch size is 1000), so your gradients and parameter updates are also 1000 times larger. The larger updates apparently lead to NaNs.
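To see the scaling concretely, here is a quick standalone check (my own toy example, not from the question) showing that the summed loss is exactly the element count times the mean loss, so its gradients scale by the same factor:

import tensorflow as tf

# Stand-in for (pred - y): three "samples" with made-up errors
err = tf.constant([[0.1], [0.2], [0.3]])
se  = tf.reduce_sum(tf.square(err))   # summed squared error
mse = tf.reduce_mean(tf.square(err))  # mean squared error

with tf.Session() as sess:
    s, m = sess.run([se, mse])
    print(s, m * 3)  # both print ~0.14: se == n_elements * mse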


Generally, learning rates are expressed per example, so the loss used to compute the update gradients should be per example as well. If the loss is summed over the batch, then the learning rate needs to be divided by the batch size to get comparable training results.
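If you do want to keep reduce_sum, one fix is to rescale the learning rate accordingly. A minimal sketch, reusing the names from the question's code (pred, y, learning_rate, SAMPLES) and assuming the full 1000-sample dataset is fed each step:

# Summed loss, trained with a learning rate divided by the batch size
se = tf.reduce_sum(tf.square(pred - y))
optimizer = tf.train.GradientDescentOptimizer(
    learning_rate=learning_rate / SAMPLES).minimize(se)

Since the gradient of se is SAMPLES times the gradient of mse, this produces the same parameter updates as minimizing mse with the original learning rate, which is why the reduce_mean version trains without blowing up.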

