Binary classification in TensorFlow, unexpected large values for loss and accuracy


Question

I am trying to use a deep neural network architecture to classify against a binary label value - -1 and +1. Here is my code to do it in TensorFlow.

import tensorflow as tf
import numpy as np
from preprocess import create_feature_sets_and_labels

train_x,train_y,test_x,test_y = create_feature_sets_and_labels()

x = tf.placeholder('float', [None, 5])
y = tf.placeholder('float')

n_nodes_hl1 = 500
n_nodes_hl2 = 500
n_nodes_hl3 = 500

n_classes = 1
batch_size = 100

def neural_network_model(data):

    hidden_1_layer = {'weights':tf.Variable(tf.random_normal([5, n_nodes_hl1])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl1]))}

    hidden_2_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl2]))}

    hidden_3_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl3]))}

    output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
                      'biases':tf.Variable(tf.random_normal([n_classes]))}


    l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
    l1 = tf.nn.relu(l1)

    l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
    l2 = tf.nn.relu(l2)

    l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['biases'])
    l3 = tf.nn.relu(l3)

    output = tf.transpose(tf.add(tf.matmul(l3, output_layer['weights']), output_layer['biases']))
    return output



def train_neural_network(x):
    prediction = neural_network_model(x)
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(prediction, y))
    optimizer = tf.train.AdamOptimizer().minimize(cost)

    hm_epochs = 10

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())

        for epoch in range(hm_epochs):
            epoch_loss = 0
            i = 0
            while i < len(train_x):
                start = i
                end = i + batch_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])

                _, c = sess.run([optimizer, cost], feed_dict={x: batch_x,
                                                              y: batch_y})
                epoch_loss += c
                i+=batch_size

            print('Epoch', epoch, 'completed out of', hm_epochs, 'loss:', epoch_loss)

        # correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
        # accuracy = tf.reduce_mean(tf.cast(correct, 'float'))

        print (test_x.shape)
        accuracy = tf.nn.l2_loss(prediction-y,name="squared_error_test_cost")/test_x.shape[0]
        print('Accuracy:', accuracy.eval({x: test_x, y: test_y}))

train_neural_network(x)

This is the output I get when I run this:

('Epoch', 0, 'completed out of', 10, 'loss:', -8400.2424869537354)
('Epoch', 1, 'completed out of', 10, 'loss:', -78980.956665039062)
('Epoch', 2, 'completed out of', 10, 'loss:', -152401.86713409424)
('Epoch', 3, 'completed out of', 10, 'loss:', -184913.46441650391)
('Epoch', 4, 'completed out of', 10, 'loss:', -165563.44775390625)
('Epoch', 5, 'completed out of', 10, 'loss:', -360394.44857788086)
('Epoch', 6, 'completed out of', 10, 'loss:', -475697.51550292969)
('Epoch', 7, 'completed out of', 10, 'loss:', -588638.92993164062)
('Epoch', 8, 'completed out of', 10, 'loss:', -745006.15966796875)
('Epoch', 9, 'completed out of', 10, 'loss:', -900172.41955566406)
(805, 5)
('Accuracy:', 5.8077128e+09)

I don't understand whether the values I am getting are correct, as there is a real dearth of non-MNIST binary classification examples. The accuracy is nothing like what I expected. I was expecting a percentage instead of that large value.

I am also somewhat unsure of the theory behind machine learning, which is why I can't tell the correctness of my approach using TensorFlow.

Can someone please tell me if my approach towards binary classification is correct? Also, is the accuracy part of my code correct?

Answer

From:

a binary label value - -1 and +1

...I am assuming your values in train_y and test_y are actually -1.0 and +1.0.

This is not going to work very well with your chosen loss function, sigmoid_cross_entropy_with_logits, which assumes labels of 0.0 and +1.0. The negative y values are causing mayhem! However, the choice of loss function is good for binary classification. I suggest changing your y values to 0 and 1.
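
To see concretely why the loss is running off to large negative numbers, here is a minimal NumPy sketch of the numerically stable formula that sigmoid_cross_entropy_with_logits computes, followed by the one-line label remap (assuming train_y and test_y hold -1.0/+1.0 floats):

import numpy as np

def sigmoid_xent(z, y):
    # TensorFlow's stable formulation of sigmoid cross-entropy,
    # for logits z and labels y:
    #   max(z, 0) - z * y + log(1 + exp(-|z|))
    return np.maximum(z, 0.0) - z * y + np.log1p(np.exp(-np.abs(z)))

print(sigmoid_xent(10.0, 1.0))    # ~0.000045: confident correct prediction
print(sigmoid_xent(-10.0, -1.0))  # ~-10.0: with a -1 label the loss grows
                                  # unboundedly negative as z -> -inf, which
                                  # matches the runaway epoch losses above

# The fix: remap {-1.0, +1.0} labels to {0.0, 1.0} before training
train_y = (np.asarray(train_y) + 1.0) / 2.0
test_y = (np.asarray(test_y) + 1.0) / 2.0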

In addition, technically the output of your network is not the final prediction. The loss function sigmoid_cross_entropy_with_logits is designed to work with a network whose output layer feeds a sigmoid transfer function, and you have got it right that the loss is applied to the output before the sigmoid is taken. So your training code appears correct.
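
In graph terms, the intended wiring looks like this (a sketch using the names from the question; the loss consumes the raw logits, and the sigmoid is only applied to produce the prediction):

output = neural_network_model(x)
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(output, y))
prediction = tf.sigmoid(output)  # probability in [0, 1], used for evaluation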

I'm not 100% sure about the tf.transpose, though - personally, I would see what happens if you remove it, i.e.

output = tf.add(tf.matmul(l3, output_layer['weights']), output_layer['biases'])

Either way, this is the "logit" output, not your prediction. The value of output can get high for very confident predictions, which probably explains the very large values you see later, due to the missing sigmoid function. So add a prediction tensor (this represents the probability/confidence that the example is in the positive class):

prediction = tf.sigmoid(output)

You can use that to calculate accuracy. Your accuracy calculation should not be based on the L2 error, but on the count of correct predictions - closer to the code you had commented out (which appears to come from a multiclass classification example). For a true/false comparison in binary classification, you need to threshold the predictions and compare with the true labels. Something like this:

predicted_class = tf.greater(prediction, 0.5)
correct = tf.equal(predicted_class, tf.equal(y, 1.0))
accuracy = tf.reduce_mean(tf.cast(correct, 'float'))

The accuracy value should be between 0.0 and 1.0. If you want a percentage, just multiply by 100, of course.
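
For example, reusing the test-set feed from the question:

print('Accuracy: %.1f%%' % (100 * accuracy.eval({x: test_x, y: test_y})))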
