Changing accuracy value and no change in loss value in binary classification using Tensorflow

Question

I am trying to use a deep neural network architecture to classify against a binary label value, 0 and +1. Here is my code to do it in TensorFlow. This question carries forward from the discussion in a previous question.

import tensorflow as tf
import numpy as np
from preprocess import create_feature_sets_and_labels

train_x,train_y,test_x,test_y = create_feature_sets_and_labels()

x = tf.placeholder('float', [None, 5])
y = tf.placeholder('float')

n_nodes_hl1 = 500
n_nodes_hl2 = 500
# n_nodes_hl3 = 500

n_classes = 1
batch_size = 100

def neural_network_model(data):

    hidden_1_layer = {'weights':tf.Variable(tf.random_normal([5, n_nodes_hl1])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl1]))}

    hidden_2_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl2]))}

    # hidden_3_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
    #                   'biases':tf.Variable(tf.random_normal([n_nodes_hl3]))}

    # output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
    #                   'biases':tf.Variable(tf.random_normal([n_classes]))}

    output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_classes])),
                    'biases':tf.Variable(tf.random_normal([n_classes]))}


    l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
    l1 = tf.nn.relu(l1)

    l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
    l2 = tf.nn.relu(l2)

    # l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['biases'])
    # l3 = tf.nn.relu(l3)

    # output = tf.transpose(tf.add(tf.matmul(l3, output_layer['weights']), output_layer['biases']))
    output = tf.add(tf.matmul(l2, output_layer['weights']), output_layer['biases'])
    return output



def train_neural_network(x):
    prediction = tf.sigmoid(neural_network_model(x))
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(prediction, y))
    optimizer = tf.train.AdamOptimizer().minimize(cost)

    hm_epochs = 10

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())

        for epoch in range(hm_epochs):
            epoch_loss = 0
            i = 0
            while i < len(train_x):
                start = i
                end = i + batch_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])

                _, c = sess.run([optimizer, cost], feed_dict={x: batch_x,
                                                              y: batch_y})
                epoch_loss += c
                i+=batch_size

            print('Epoch', epoch, 'completed out of', hm_epochs, 'loss:', epoch_loss)

        # correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
        # accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        predicted_class = tf.greater(prediction,0.5)
        correct = tf.equal(predicted_class, tf.equal(y,1.0))
        accuracy = tf.reduce_mean( tf.cast(correct, 'float') )

        # print (test_x.shape)
        # accuracy = tf.nn.l2_loss(prediction-y,name="squared_error_test_cost")/test_x.shape[0]
        print('Accuracy:', accuracy.eval({x: test_x, y: test_y}))

train_neural_network(x) 

Specifically (carrying over the discussion from the previous question), I removed one layer, hidden_3_layer. I changed

prediction = neural_network_model(x)

to

prediction = tf.sigmoid(neural_network_model(x))

and added the predicted_class, correct, accuracy part according to Neil's answer. I also changed all -1s to 0s in my csv.

This is my trace:

('Epoch', 0, 'completed out of', 10, 'loss:', 37.312037646770477)
('Epoch', 1, 'completed out of', 10, 'loss:', 37.073578298091888)
('Epoch', 2, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 3, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 4, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 5, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 6, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 7, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 8, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 9, 'completed out of', 10, 'loss:', 37.035196363925934)
('Accuracy:', 0.42608696)

As you can see, the loss doesn't decrease, so I don't know whether it is still working correctly.

Here are results from multiple re-runs. The results swing wildly:

('Epoch', 0, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 1, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 2, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 3, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 4, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 5, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 6, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 7, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 8, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 9, 'completed out of', 10, 'loss:', 26.513012945652008)
('Accuracy:', 0.60124224)

Another:

('Epoch', 0, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 1, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 2, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 3, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 4, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 5, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 6, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 7, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 8, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 9, 'completed out of', 10, 'loss:', 22.873702049255371)
('Accuracy:', 1.0)

And another:

('Epoch', 0, 'completed out of', 10, 'loss:', 23.163824260234833)
('Epoch', 1, 'completed out of', 10, 'loss:', 22.88000351190567)
('Epoch', 2, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 3, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 4, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 5, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 6, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 7, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 8, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 9, 'completed out of', 10, 'loss:', 22.873702049255371)
('Accuracy:', 0.99627328)

I have also seen an accuracy value of 0.0 -_-

---------------EDIT---------------

Some details about the data and data processing: I am using daily stock data for IBM from Yahoo! Finance for an (almost) 20-year period. This amounts to roughly 5200 lines of entries.

Here is how I process it:

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import csv
import pickle

def create_feature_sets_and_labels(test_size = 0.2):
    df = pd.read_csv("ibm.csv")
    df = df.iloc[::-1]
    features = df.values
    testing_size = int(test_size*len(features))
    train_x = list(features[1:,1:6][:-testing_size])
    train_y = list(features[1:,7][:-testing_size])
    test_x = list(features[1:,1:6][-testing_size:])
    test_y = list(features[1:,7][-testing_size:])
    scaler = MinMaxScaler(feature_range=(-5,5))
    train_x = scaler.fit_transform(train_x)
    train_y = scaler.fit_transform(train_y)
    test_x = scaler.fit_transform(test_x)
    test_y = scaler.fit_transform(test_y)

    return train_x, train_y, test_x, test_y

if __name__ == "__main__":
    train_x, train_y, test_x, test_y = create_feature_sets_and_labels()
    with open('stockdata.pickle', 'wb') as f:
        pickle.dump([train_x, train_y, test_x, test_y], f)

Column 0 is the date, so it is not used as a feature. Nor is column 7. I normalized the data using sklearn's MinMaxScaler() over a range of -5 to 5.

-------------EDIT 2-------------------

I've noticed that the system doesn't change its accuracy when the data is presented in non-normalized form.

Recommended answer

Once you pre-process your data into the wrong shape or range in an ML training task, the rest of the data flow will go wrong. You do this multiple times, in different ways, in the code in the question.

Taking things in the order that the processing occurs, the first problems are with pre-processing. Your goals here should be:

  • X values (input features) in tabular form: each row is an example, each column is a feature. Values should be numeric and scaled for use with a neural network. Test and train data need to be scaled identically. That doesn't mean calling .fit_transform on each of them, because that re-fits the scaler.

  • Y values (output labels) in tabular form: each row is an example matching the same row of X, each column is the true value of an output. For classification problems the values are typically 0 and 1, and should not be re-scaled since they represent class membership.

This rewrite of your create_feature_sets_and_labels function does things correctly:

def create_feature_sets_and_labels(test_size = 0.2):
    df = pd.read_csv("ibm.csv")
    df = df.iloc[::-1]
    features = df.values
    testing_size = int(test_size*len(features))

    train_x = np.array(features[1:,1:6][:-testing_size]).astype(np.float32)
    train_y = np.array(features[1:,7][:-testing_size]).reshape(-1, 1).astype(np.float32)

    test_x = np.array(features[1:,1:6][-testing_size:]).astype(np.float32)
    test_y = np.array(features[1:,7][-testing_size:]).reshape(-1, 1).astype(np.float32)

    scaler = MinMaxScaler(feature_range=(-5,5))

    scaler.fit(train_x)

    train_x = scaler.transform(train_x)
    test_x = scaler.transform(test_x)

    return train_x, train_y, test_x, test_y

Important differences from your version:

  • Using typecast np.array, not list (minor difference)

  • y values are tabular [n_examples, n_outputs] (major difference; your row vector shape is the cause of many problems later)

  • The scaler is fit once, on the training features, then applied to both train and test features (major difference; if you scale train and test data separately, you are not predicting anything meaningful)

  • The scaler is not applied to outputs (major difference for a classifier; you want the train and test values to be 0 and 1 for meaningful training and accuracy reporting)
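To see why the label shape matters for the accuracy figure, here is a small NumPy sketch. It is not the original TensorFlow graph; the arrays are made-up stand-ins, and it assumes only that TensorFlow's comparison ops broadcast in the same way NumPy does. Comparing a [n, 1] prediction against a flat [n] label vector silently produces an [n, n] matrix, so the reported "accuracy" averages every prediction against every label:

```python
import numpy as np

# Hypothetical stand-ins: 4 predictions shaped like the network output, 4 labels.
prediction = np.array([[0.9], [0.2], [0.7], [0.1]])  # shape (4, 1)
labels_flat = np.array([1.0, 0.0, 0.0, 0.0])         # shape (4,), a flat vector
labels_col = labels_flat.reshape(-1, 1)              # shape (4, 1), tabular

predicted_class = prediction > 0.5

# (4, 1) against (4,) broadcasts to (4, 4): each prediction meets every label.
correct_flat = predicted_class == (labels_flat == 1.0)
# (4, 1) against (4, 1) stays element-wise, one comparison per example.
correct_col = predicted_class == (labels_col == 1.0)

accuracy_flat = correct_flat.mean()
accuracy_col = correct_col.mean()
print(correct_flat.shape, accuracy_flat)
print(correct_col.shape, accuracy_col)
```

With the flat labels the number printed is not an accuracy at all, which is consistent with the wildly varying values in the question.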

There are also some problems with your training code for this data:

  • y = tf.placeholder('float') should be y = tf.placeholder('float', [None, 1]). This makes no difference to processing, but correctly throws an error when y is the wrong shape. That error would have been a clue much earlier that things were going wrong.

  • n_nodes_hl1 = 500 and n_nodes_hl2 = 500 can be much lower, and the network will actually work much better with e.g. n_nodes_hl1 = 10 and n_nodes_hl2 = 10. This is mainly because you are using large initial values for the weights. You could alternatively scale the weights down, and for more complex data you might want to do that instead. In this case it is simpler to reduce the number of hidden neurons.
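As a rough illustration of the point about large initial weights, the following NumPy sketch (not the original TensorFlow code) builds untrained 5 -> n_hidden -> n_hidden -> 1 ReLU networks with every weight and bias drawn from a standard normal, as tf.random_normal does by default, and measures the typical size of the output logit. The function name, the uniform inputs in [-5, 5] and the sample counts are assumptions chosen for the illustration:

```python
import numpy as np

def median_abs_logit(n_hidden, n_networks=20, n_samples=200, seed=0):
    """Median |logit| over several untrained networks with N(0, 1) weights."""
    rng = np.random.RandomState(seed)
    values = []
    for _ in range(n_networks):
        # Made-up inputs covering the scaled feature range of -5 to 5.
        x = rng.uniform(-5, 5, size=(n_samples, 5))
        w1, b1 = rng.randn(5, n_hidden), rng.randn(n_hidden)
        w2, b2 = rng.randn(n_hidden, n_hidden), rng.randn(n_hidden)
        w3, b3 = rng.randn(n_hidden, 1), rng.randn(1)
        l1 = np.maximum(x @ w1 + b1, 0.0)   # ReLU hidden layer 1
        l2 = np.maximum(l1 @ w2 + b2, 0.0)  # ReLU hidden layer 2
        logits = l2 @ w3 + b3
        values.append(np.abs(logits).ravel())
    return float(np.median(np.concatenate(values)))

scale_500 = median_abs_logit(500)
scale_10 = median_abs_logit(10)
print('500 hidden units:', scale_500)
print('10 hidden units: ', scale_10)
```

The wider network starts with far larger logits. A sigmoid applied to values that large is numerically 0 or 1, so the network begins from extremely confident outputs, which is one plausible reason the smaller layers (or smaller initial weights) train better here.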

As we discussed in comments, the start of your train_neural_network function should look like this:

output = neural_network_model(x)
prediction = tf.sigmoid(output)
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(output, y))
optimizer = tf.train.AdamOptimizer().minimize(cost)

. . . this is a major difference. By using sigmoid_cross_entropy_with_logits you have committed to using the pre-transform value of the output layer for training. But you still want the predicted values to measure accuracy (or for any other use of the network where you want to read off a predicted value).
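A plain-Python sketch of why feeding the sigmoid output into a "with logits" loss goes wrong. The helper below re-implements the numerically stable formula documented for sigmoid_cross_entropy_with_logits, max(x, 0) - x * z + log(1 + exp(-|x|)); the helper names and the example values are illustrative assumptions, not part of the original code:

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def sigmoid_cross_entropy(logit, label):
    # Stable form of the loss: max(x, 0) - x * z + log(1 + exp(-|x|)).
    # The sigmoid is applied implicitly, so `logit` must be the raw output.
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

logit, label = 8.0, 1.0  # a confident, correct output for a positive example

loss_from_logit = sigmoid_cross_entropy(logit, label)
loss_from_probability = sigmoid_cross_entropy(sigmoid(logit), label)

print(loss_from_logit)        # close to zero, as it should be
print(loss_from_probability)  # stuck above log(1 + exp(-1))
```

Because a sigmoid output lies in (0, 1), treating it as a logit means the loss can never fall below log(1 + exp(-1)), about 0.313, for a positive example, or below log(2), about 0.693, for a negative one, however good the network is.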

For a consistent measure of loss, you want the mean loss per example, so you need to divide your sum of mean-per-batch losses by the number of batches: 'loss:', epoch_loss/(len(train_x)/batch_size)
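The averaging can be sketched in plain Python. The batch losses and sizes below are made-up numbers, and mean_loss_per_example is a hypothetical helper. It also shows a small refinement: when the last batch is shorter than the others, weighting each batch mean by its size gives the exact per-example mean, while the simple average over batches is only an approximation:

```python
def mean_loss_per_example(batch_means, batch_sizes):
    """Exact mean loss per example from per-batch mean losses."""
    total = sum(m * n for m, n in zip(batch_means, batch_sizes))
    return total / sum(batch_sizes)

# Hypothetical epoch: three batches, the last one only half full.
batch_means = [0.70, 0.60, 0.50]
batch_sizes = [100, 100, 50]

simple_mean = sum(batch_means) / len(batch_means)
weighted_mean = mean_loss_per_example(batch_means, batch_sizes)

print('simple mean over batches:', simple_mean)
print('weighted per-example mean:', weighted_mean)
```

Either figure is comparable across runs and data set sizes, unlike the raw sum printed in the question.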

If I make all those corrections and run this for a few more epochs, e.g. 50, then I get a typical loss of 0.7 and an accuracy of 0.5. This occurs reasonably reliably, but does move a little due to changes in starting weights. The accuracy is not very stable, and possibly suffers from over-fitting, which you are not allowing for at all (you should read up on techniques to help measure and manage over-fitting; it is an important part of training NNs reliably).

The value of 0.5 may seem bad. It is possible to improve on it by modifying the network architecture or meta-params. For example, by swapping tf.nn.relu for tf.tanh in the hidden layers and running for 500 epochs, I can get down to 0.43 training loss and up to 0.83 test accuracy.

To understand more about neural networks, what to measure when training, and what might be worth changing in your model, you will want to study the subject in more depth.
