Super high cost Tensorflow


Question

I'm trying to make some price predictions on a Kaggle dataset with Tensorflow. My neural network is learning, but my cost function is really high and my predictions are far from the real output. I tried changing my network by adding or removing layers, neurons, and activation functions. I experimented a lot with the hyper-parameters, but that didn't change much. I don't think the problem comes from my data; I checked on Kaggle and it's the dataset most people use.

If you have any idea why my cost is so high and how to reduce it, and if you could explain it to me, that would be really great!

Here is my code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
from sklearn.utils import shuffle

df = pd.read_csv(r"C:\Users\User\Documents\TENSORFLOW\Prediction prix\train2.csv", sep=';')
df.head()

# keep a handful of numeric features plus the target
df = df.loc[:, ['OverallQual', 'GrLivArea', 'GarageCars', 'TotalBsmtSF', 'FullBath', 'SalePrice']]

# replace missing values with 0
df = df.replace(np.nan, 0)

df

%matplotlib inline
plt = sns.pairplot(df)
plt

df = shuffle(df)

# simple train/test split
df_train = df[0:1000]
df_test = df[1001:1451]

inputX = df_train.drop('SalePrice', 1).as_matrix()
inputX = inputX.astype(int)

inputY = df_train.loc[:, ['SalePrice']].as_matrix()
inputY = inputY.astype(int)

inputX_test = df_test.drop('SalePrice', 1).as_matrix()
inputX_test = inputX_test.astype(int)

inputY_test = df_test.loc[:, ['SalePrice']].as_matrix()
inputY_test = inputY_test.astype(int)



# Parameters
learning_rate = 0.01
training_epochs = 1000
batch_size = 500
display_step = 50

n_samples = inputX.shape[0]


x = tf.placeholder(tf.float32, [None, 5])
y = tf.placeholder(tf.float32, [None, 1])


def add_layer(inputs, in_size, out_size, activation_function=None):
    Weights = tf.Variable(tf.random_normal([in_size, out_size], stddev=0.1))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    Wx_plus_b = tf.matmul(inputs, Weights) + biases
    if activation_function is None:
        output = Wx_plus_b
    else:
        output = activation_function(Wx_plus_b)
    return output


l1 = add_layer(x, 5, 3, activation_function=tf.nn.relu)

pred = add_layer(l1, 3, 1)


# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-y, 2))/(2*n_samples)
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)


# Initializing the variables
init = tf.global_variables_initializer()


# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = batch_size
        # Loop over all batches
        for i in range(total_batch):
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c = sess.run([optimizer, cost], feed_dict={x: inputX,
                                                          y: inputY})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", \
                "{:.9f}".format(avg_cost))
    print("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(pred,y)
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Accuracy:", accuracy.eval({x: inputX, y: inputY}))
    print(sess.run(pred, feed_dict={x: inputX_test}))

Epoch: 0001 cost= 10142407502702304395526144.000000000
Epoch: 0051 cost= 3256106752.000019550
Epoch: 0101 cost= 3256106752.000019550
Epoch: 0151 cost= 3256106752.000019550
Epoch: 0201 cost= 3256106752.000019550
...

Thanks for your help!

Answer

I see a couple of problems with the implementation:

  1. Inputs are not scaled.
    Use sklearn's StandardScaler to scale the inputs inputX, inputY (and also inputX_test and inputY_test) so that they have zero mean and unit variance. You can use inverse_transform to convert the outputs back to the proper scale again (a sketch for the target follows the snippet below).

from sklearn.preprocessing import StandardScaler

sc = StandardScaler().fit(inputX)
inputX = sc.transform(inputX)
inputX_test = sc.transform(inputX_test)
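
The target can be scaled the same way with a second scaler; this is a minimal sketch of that step (sc1 matches the name used in the full listing further down), and sc1 is also what inverse_transform would later be called on to map predictions back to real prices:

sc1 = StandardScaler().fit(inputY)
inputY = sc1.transform(inputY)
inputY_test = sc1.transform(inputY_test)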

  • The batch_size is too large; you are passing the entire set as a single batch. This should not cause the particular problem you are facing, but for better convergence try a reduced batch size. Implement a get_batch() generator function and do the following (a minimal sketch of the generator follows the snippet):

    for batch_X, batch_Y in get_batch(inputX, inputY, batch_size):
        _, c = sess.run([optimizer, cost], feed_dict={x: batch_X,
                                                      y: batch_Y})
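
One minimal way to write such a generator, essentially the same as the one used in the full listing further down:

    def get_batch(inputX, inputY, batch_size):
        # yield consecutive, non-overlapping mini-batches over the training set
        for idx in range(0, len(inputX) - batch_size + 1, batch_size):
            yield inputX[idx:idx + batch_size], inputY[idx:idx + batch_size]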
    

  • If you are still facing the issue, try with a smaller Weights initialization (stddev).
  • Working code below:

    from sklearn.preprocessing import StandardScaler

    inputX = df_train.drop('SalePrice', 1).as_matrix()
    inputX = inputX.astype(int)
    sc = StandardScaler().fit(inputX)
    inputX = sc.transform(inputX)
    
    inputY = df_train.loc[:, ['SalePrice']].as_matrix()
    inputY = inputY.astype(int)
    sc1 = StandardScaler().fit(inputY)
    inputY = sc1.transform(inputY)
    
    inputX_test = df_test.drop('SalePrice', 1).as_matrix()
    inputX_test = inputX_test.astype(int)
    inputX_test = sc.transform(inputX_test)
    
    inputY_test = df_test.loc[:, ['SalePrice']].as_matrix()
    inputY_test = inputY_test.astype(int)
    inputY_test = sc1.transform(inputY_test)
    
    learning_rate = 0.01
    training_epochs = 1000
    batch_size = 50
    display_step = 50
    
    n_samples = inputX.shape[0]
    
    x = tf.placeholder(tf.float32, [None, 5])
    y = tf.placeholder(tf.float32, [None, 1])
    
    def get_batch(inputX, inputY, batch_size):
      duration = len(inputX)
      for i in range(0,duration//batch_size):
        idx = i*batch_size
        yield inputX[idx:idx+batch_size], inputY[idx:idx+batch_size]
    
    
    def add_layer(inputs, in_size, out_size, activation_function=None):
      Weights = tf.Variable(tf.random_normal([in_size, out_size], stddev=0.005))
      biases = tf.Variable(tf.zeros([1, out_size]))
      Wx_plus_b = tf.matmul(inputs, Weights) + biases
      if activation_function is None:
        output = Wx_plus_b
      else:
        output = activation_function(Wx_plus_b)
      return output
    
    
    l1 = add_layer(x, 5, 3, activation_function=tf.nn.relu)
    
    pred = add_layer(l1, 3, 1)
    
    # Mean squared error
    cost = tf.reduce_mean(tf.pow(tf.subtract(pred, y), 2))
    # Gradient descent
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)
    
    
    # Initializing the variables
    init = tf.global_variables_initializer()
    
    
    # Launch the graph
    with tf.Session() as sess:
     sess.run(init)
    
     # Training cycle
     for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = n_samples // batch_size  # number of mini-batches per epoch
        # Loop over all batches
        #for i in range(total_batch):
        for batch_x, batch_y in get_batch(inputX, inputY, batch_size):
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c, _l1, _pred = sess.run([optimizer, cost, l1, pred], feed_dict={x: batch_x, y: batch_y})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f} ".format(avg_cost))
            #print(_l1, _pred)
    print("Optimization Finished!")
    
