MLP in TensorFlow for regression... not converging

Problem description

Hello, this is my first time working with TensorFlow. I tried to adapt the example from TensorFlow-Examples to use this code for regression problems with the Boston database. Basically, I only changed the cost function, the database, the number of inputs and the number of targets, but when I run it the MLP doesn't converge (I use a very low learning rate). I tested it with the Adam optimizer and with gradient descent optimization, but I get the same behavior. I appreciate your suggestions and ideas!

Observation: when I run the program without the modifications described above, the cost function always decreases.

Here is the evolution when I run the model: the cost function oscillates even with a very low learning rate. In the worst case, I would expect the model to converge to a value; for example, epoch 944 shows a value of 0.2267548, and if no better value is found, this value should remain until the optimization finishes.

Epoch: 0942 cost= 0.445707272
Epoch: 0943 cost= 0.389314095
Epoch: 0944 cost= 0.226754842
Epoch: 0945 cost= 0.404150135
Epoch: 0946 cost= 0.382190095
Epoch: 0947 cost= 0.897880572
Epoch: 0948 cost= 0.481954243
Epoch: 0949 cost= 0.269408980
Epoch: 0950 cost= 0.427961614
Epoch: 0951 cost= 1.206053280
Epoch: 0952 cost= 0.834200084

from __future__ import print_function

# Import MNIST data
#from tensorflow.examples.tutorials.mnist import input_data
#mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf
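# ToolInputData is presumably the asker's own helper module that loads and splits the CSV file (not shown in the question)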
import ToolInputData as input_data

ALL_DATA_FILE_NAME = "boston_normalized.csv"



## Load the complete database; it is then split into training, validation and test sets
completedDatabase = input_data.Databases(databaseFileName=ALL_DATA_FILE_NAME,     targetLabel="MEDV", trainPercentage=0.70, valPercentage=0.20, testPercentage=0.10,
                  randomState=42, inputdataShuffle=True, batchDataShuffle=True)


# Parameters
learning_rate = 0.0001
training_epochs = 1000
batch_size = 5
display_step = 1

# Network Parameters
n_hidden_1 = 10 # 1st layer number of neurons
n_hidden_2 = 10 # 2nd layer number of neurons

n_input = 13 # number of features of my database
n_classes = 1 # one target value (float)

# tf Graph input
x = tf.placeholder("float", [None, n_input])
y = tf.placeholder("float", [None, n_classes])


# Create model
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with RELU activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    # Hidden layer with RELU activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # Output layer with linear activation
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1])),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1])),
    'b2': tf.Variable(tf.random_normal([n_hidden_2])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# Construct model
pred = multilayer_perceptron(x, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.square(pred-y))
#cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer =  tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(completedDatabase.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_x, batch_y = completedDatabase.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_x,
                                                      y: batch_y})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", \
                "{:.9f}".format(avg_cost))
    print("Optimization Finished!")    

Answer

You stated that your labels are in the range [0,1], but I cannot see that the predictions are in the same range. In order to make them comparable to the labels, you should transform them to the same range before returning, for example using the sigmoid function:

out_layer = tf.matmul(...)
out = tf.sigmoid(out_layer)
return out
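
Applied to the multilayer_perceptron function from the question, the change could look like the sketch below. This is only a sketch; it reuses the weights and biases dictionaries already defined in the question's code and changes nothing except the final activation:

def multilayer_perceptron(x, weights, biases):
    # Hidden layer with ReLU activation
    layer_1 = tf.nn.relu(tf.add(tf.matmul(x, weights['h1']), biases['b1']))
    # Hidden layer with ReLU activation
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, weights['h2']), biases['b2']))
    # Output layer: squash into [0,1] with a sigmoid so predictions lie in the same range as the labels
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return tf.sigmoid(out_layer)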

Maybe this fixes the stability problem. You might also want to increase the batch size a bit, for example to 20 examples per batch. If this improves performance, you can probably increase the learning rate a bit.
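
In the question's script that would only mean adjusting the two hyperparameters near the top; for example (the exact values below are untested suggestions, not verified on the Boston data):

# Parameters (suggested values, not verified)
learning_rate = 0.001  # can be raised a little once training is stable
batch_size = 20        # larger batches give a less noisy gradient estimate per step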
