TensorFlow: Neural Network accuracy always 100% on train and test sets

Question

I created a TensorFlow neural network that has 2 hidden layers with 10 units each, using ReLU activations and Xavier initialization for the weights. The output layer has 1 unit that performs binary classification (0 or 1) using the sigmoid activation function, predicting whether a passenger on the Titanic survived based on the input features.

(The only code omitted is the load_data function, which populates the variables X_train, Y_train, X_test, Y_test used later in the program.)

Parameters

# Hyperparams 
learning_rate = 0.001
lay_dims = [10,10, 1]

# Other params
m = X_train.shape[1] 
n_x = X_train.shape[0]
n_y = Y_train.shape[0]

Inputs

X = tf.placeholder(tf.float32, shape=[X_train.shape[0], None], name="X")
norm = tf.nn.l2_normalize(X, 0) # normalize inputs

Y = tf.placeholder(tf.float32, shape=[Y_train.shape[0], None], name="Y")

Initialize Weights & Biases

W1 = tf.get_variable("W1", [lay_dims[0],n_x], initializer=tf.contrib.layers.xavier_initializer())
b1 = tf.get_variable("b1", [lay_dims[0],1], initializer=tf.zeros_initializer())

W2 = tf.get_variable("W2", [lay_dims[1],lay_dims[0]], initializer=tf.contrib.layers.xavier_initializer())
b2 = tf.get_variable("b2", [lay_dims[1],1], initializer=tf.zeros_initializer())

W3 = tf.get_variable("W3", [lay_dims[2],lay_dims[1]], initializer=tf.contrib.layers.xavier_initializer())
b3 = tf.get_variable("b3", [lay_dims[2],1], initializer=tf.zeros_initializer())

Forward Prop

Z1 = tf.add(tf.matmul(W1,X), b1)
A1 = tf.nn.relu(Z1)

Z2 = tf.add(tf.matmul(W2,A1), b2)
A2 = tf.nn.relu(Z2)

Y_hat = tf.add(tf.matmul(W3,A2), b3)

BackProp

cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=tf.transpose(Y_hat), labels=tf.transpose(Y)))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

Session

# Initialize
init = tf.global_variables_initializer()

with tf.Session() as sess:
    # Initialize
    sess.run(init)

    # Normalize Inputs
    sess.run(norm, feed_dict={X:X_train, Y:Y_train})

    # Forward/Backprop and update weights
    for i in range(10000):
        c, _ = sess.run([cost, optimizer], feed_dict={X:X_train, Y:Y_train})
        if i % 100 == 0:
            print(c)

    correct_prediction = tf.equal(tf.argmax(Y_hat), tf.argmax(Y))

    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

    print("Training Set:", sess.run(accuracy, feed_dict={X: X_train, Y: Y_train}))
    print("Testing Set:", sess.run(accuracy, feed_dict={X: X_test, Y: Y_test}))

After running 10,000 epochs of training, the cost goes down each time, which suggests that the learning_rate is okay and that the cost function behaves normally. However, after training, all of my Y_hat values (predictions on the training set) are 1 (predicting that the passenger survived). So basically the prediction just outputs y=1 for every training example.

Also, when I run tf.argmax on Y_hat, the result is a matrix of all 0's. The same thing happens when tf.argmax is applied to Y (the ground truth labels), which is odd because Y consists of all the correct labels for the training examples.

Any help is greatly appreciated. Thanks.

Recommended answer

I assume your Y_hat is a (1, m) matrix, where m is the number of training examples. In that case tf.argmax(Y_hat) returns all 0's. According to the TensorFlow documentation, argmax

"Returns the index with the largest value across axes of a tensor."

If you do not pass in an axis, the axis defaults to 0. Because axis 0 holds only one value here, the returned index is always 0. The same applies to tf.argmax(Y), so tf.equal compares 0 with 0 for every example and the reported accuracy is always 100%.
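The effect can be reproduced outside a TensorFlow session. The sketch below uses NumPy rather than the question's graph, with made-up logits and labels standing in for Y_hat and Y (both hypothetical, shape (1, 5)). Note that np.argmax needs axis=0 passed explicitly to mirror tf.argmax's default. The thresholding step at the end is one common way to score a single sigmoid output; it is an assumption here, not something the answer above prescribes.

```python
import numpy as np

# Hypothetical stand-ins for the question's tensors: 1 output unit, m = 5 examples.
logits = np.array([[2.0, -1.5, 0.3, -0.2, 4.0]])  # plays the role of Y_hat, shape (1, 5)
labels = np.array([[1.0, 0.0, 0.0, 1.0, 1.0]])    # plays the role of Y, shape (1, 5)

# tf.argmax defaults to axis=0; np.argmax must be told axis=0 to behave the same way.
pred_idx = np.argmax(logits, axis=0)
label_idx = np.argmax(labels, axis=0)

# Axis 0 has length 1, so both index arrays are all zeros and always "match".
broken_accuracy = np.mean(pred_idx == label_idx)

# Assumed alternative: apply the sigmoid and threshold at 0.5 (equivalent to logit > 0).
probs = 1.0 / (1.0 + np.exp(-logits))
predictions = (probs > 0.5).astype(np.float32)
accuracy = np.mean(predictions == labels)

print("argmax-based accuracy:", broken_accuracy)
print("threshold-based accuracy:", accuracy)
```

In the question's TF1 graph, the analogous change would likely be to compare tf.cast(tf.sigmoid(Y_hat) > 0.5, tf.float32) against Y with tf.equal, instead of comparing the two argmax results.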
