Changing accuracy value and no change in loss value in binary classification using Tensorflow


Question


I am trying to use a deep neural network architecture to classify against a binary label value - 0 and +1. Here is my code to do it in tensorflow. Also, this question carries forward from the discussion in a previous question.

import tensorflow as tf
import numpy as np
from preprocess import create_feature_sets_and_labels

train_x,train_y,test_x,test_y = create_feature_sets_and_labels()

x = tf.placeholder('float', [None, 5])
y = tf.placeholder('float')

n_nodes_hl1 = 500
n_nodes_hl2 = 500
# n_nodes_hl3 = 500

n_classes = 1
batch_size = 100

def neural_network_model(data):

    hidden_1_layer = {'weights':tf.Variable(tf.random_normal([5, n_nodes_hl1])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl1]))}

    hidden_2_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
                      'biases':tf.Variable(tf.random_normal([n_nodes_hl2]))}

    # hidden_3_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
    #                   'biases':tf.Variable(tf.random_normal([n_nodes_hl3]))}

    # output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
    #                   'biases':tf.Variable(tf.random_normal([n_classes]))}

    output_layer = {'weights':tf.Variable(tf.random_normal([n_nodes_hl2, n_classes])),
                    'biases':tf.Variable(tf.random_normal([n_classes]))}


    l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
    l1 = tf.nn.relu(l1)

    l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
    l2 = tf.nn.relu(l2)

    # l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['biases'])
    # l3 = tf.nn.relu(l3)

    # output = tf.transpose(tf.add(tf.matmul(l3, output_layer['weights']), output_layer['biases']))
    output = tf.add(tf.matmul(l2, output_layer['weights']), output_layer['biases'])
    return output



def train_neural_network(x):
    prediction = tf.sigmoid(neural_network_model(x))
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(prediction, y))
    optimizer = tf.train.AdamOptimizer().minimize(cost)

    hm_epochs = 10

    with tf.Session() as sess:
        sess.run(tf.initialize_all_variables())

        for epoch in range(hm_epochs):
            epoch_loss = 0
            i = 0
            while i < len(train_x):
                start = i
                end = i + batch_size
                batch_x = np.array(train_x[start:end])
                batch_y = np.array(train_y[start:end])

                _, c = sess.run([optimizer, cost], feed_dict={x: batch_x,
                                                              y: batch_y})
                epoch_loss += c
                i += batch_size

            print('Epoch', epoch, 'completed out of', hm_epochs, 'loss:', epoch_loss)

        # correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(y, 1))
        # accuracy = tf.reduce_mean(tf.cast(correct, 'float'))
        predicted_class = tf.greater(prediction,0.5)
        correct = tf.equal(predicted_class, tf.equal(y,1.0))
        accuracy = tf.reduce_mean( tf.cast(correct, 'float') )

        # print (test_x.shape)
        # accuracy = tf.nn.l2_loss(prediction-y,name="squared_error_test_cost")/test_x.shape[0]
        print('Accuracy:', accuracy.eval({x: test_x, y: test_y}))

train_neural_network(x) 

Specifically, (carrying over the discussion from the previous question) I removed one layer - hidden_3_layer. Changed

prediction = neural_network_model(x)

to

prediction = tf.sigmoid(neural_network_model(x))

and added the predicted_class, correct, accuracy part according to Neil's answer. I also changed all -1s to 0s in my csv.

This is my trace:

('Epoch', 0, 'completed out of', 10, 'loss:', 37.312037646770477)
('Epoch', 1, 'completed out of', 10, 'loss:', 37.073578298091888)
('Epoch', 2, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 3, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 4, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 5, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 6, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 7, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 8, 'completed out of', 10, 'loss:', 37.035196363925934)
('Epoch', 9, 'completed out of', 10, 'loss:', 37.035196363925934)
('Accuracy:', 0.42608696)

As you can see, the loss doesn't decrease. Hence I don't know if it is still working correctly.

Here are results from multiple re-runs. Results are swaying wildly:

('Epoch', 0, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 1, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 2, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 3, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 4, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 5, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 6, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 7, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 8, 'completed out of', 10, 'loss:', 26.513012945652008)
('Epoch', 9, 'completed out of', 10, 'loss:', 26.513012945652008)
('Accuracy:', 0.60124224)

another:

('Epoch', 0, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 1, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 2, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 3, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 4, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 5, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 6, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 7, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 8, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 9, 'completed out of', 10, 'loss:', 22.873702049255371)
('Accuracy:', 1.0)

and another:

('Epoch', 0, 'completed out of', 10, 'loss:', 23.163824260234833)
('Epoch', 1, 'completed out of', 10, 'loss:', 22.88000351190567)
('Epoch', 2, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 3, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 4, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 5, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 6, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 7, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 8, 'completed out of', 10, 'loss:', 22.873702049255371)
('Epoch', 9, 'completed out of', 10, 'loss:', 22.873702049255371)
('Accuracy:', 0.99627328)

I have also seen accuracy value of 0.0 -_-

---------------EDIT---------------

Some details about the data and the data processing. I am using daily stock data for IBM from Yahoo! Finance over an (almost) 20-year period. This amounts to roughly 5200 rows of entries.

Here is how I am processing it:

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
import csv
import pickle

def create_feature_sets_and_labels(test_size = 0.2):
    df = pd.read_csv("ibm.csv")
    df = df.iloc[::-1]
    features = df.values
    testing_size = int(test_size*len(features))
    train_x = list(features[1:,1:6][:-testing_size])
    train_y = list(features[1:,7][:-testing_size])
    test_x = list(features[1:,1:6][-testing_size:])
    test_y = list(features[1:,7][-testing_size:])
    scaler = MinMaxScaler(feature_range=(-5,5))
    train_x = scaler.fit_transform(train_x)
    train_y = scaler.fit_transform(train_y)
    test_x = scaler.fit_transform(test_x)
    test_y = scaler.fit_transform(test_y)

    return train_x, train_y, test_x, test_y

if __name__ == "__main__":
    train_x, train_y, test_x, test_y = create_feature_sets_and_labels()
    with open('stockdata.pickle', 'wb') as f:
        pickle.dump([train_x, train_y, test_x, test_y], f)

Column 0 is the date, so it is not used as a feature. Nor is column 7. I normalized the data using sklearn's MinMaxScaler() over a range of -5 to 5.

-------------EDIT 2-------------------

I've noticed that the system doesn't change its accuracy when the data is presented in non-normalized form.

Solution

Once you pre-process your data into the wrong shape or range in an ML training task, the rest of the data flow goes wrong. You do this multiple times, in different ways, in the code in the question.

Taking things in the order that the processing occurs, the first problems are with pre-processing. Your goals here should be:

  • X values (input features) in tabular form: each row is an example, each column is a feature. Values should be numeric and scaled for use with a neural network. Test and train data need to be scaled identically - that doesn't mean using the same .fit_transform call, because that re-fits the scaler.

  • Y values (output labels) in tabular form: each row is an example matching the same row of X, each column is the true value of an output. For classification problems the values are typically 0 and 1, and they should not be re-scaled since they represent class membership.

This re-write of your create_feature_sets_and_labels function does things correctly:

def create_feature_sets_and_labels(test_size = 0.2):
    df = pd.read_csv("ibm.csv")
    df = df.iloc[::-1]
    features = df.values
    testing_size = int(test_size*len(features))

    train_x = np.array(features[1:,1:6][:-testing_size]).astype(np.float32)
    train_y = np.array(features[1:,7][:-testing_size]).reshape(-1, 1).astype(np.float32)

    test_x = np.array(features[1:,1:6][-testing_size:]).astype(np.float32)
    test_y = np.array(features[1:,7][-testing_size:]).reshape(-1, 1).astype(np.float32)

    scaler = MinMaxScaler(feature_range=(-5,5))

    scaler.fit(train_x)

    train_x = scaler.transform(train_x)
    test_x = scaler.transform(test_x)

    return train_x, train_y, test_x, test_y

Important differences from your version:

  • Uses np.array with an explicit typecast, not list (minor difference)

  • y values are tabular [n_examples, n_outputs] (major difference - your row-vector shape is the cause of many problems later; see the quick check after this list)

  • Scaler is fit once then applied to features (major difference, if you scale train and test data separately, you are not predicting anything meaningful)

  • Scaler is not applied to outputs (major difference for a classifier - you want the train and test values to be 0/1 for meaningful training and for reporting accuracy)
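
A quick way to sanity-check those differences (a hypothetical snippet, assuming the rewritten function above and a 0/1 label column in the csv) is to print the shapes and value ranges it returns:

import numpy as np
from preprocess import create_feature_sets_and_labels

train_x, train_y, test_x, test_y = create_feature_sets_and_labels()

print(train_x.shape, train_y.shape)   # e.g. (4159, 5) and (4159, 1) - both tabular
print(train_x.min(), train_x.max())   # roughly -5.0 and 5.0 after MinMaxScaler
print(np.unique(train_y))             # should be [0. 1.] - labels left unscaled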

There are also some problems with your training code for this data:

  • y = tf.placeholder('float') should be y = tf.placeholder('float', [None, 1]). This makes no difference to processing, but correctly throws an error when y is the wrong shape. That error would have been a clue much earlier that things were going wrong.

  • n_nodes_hl1 = 500 and n_nodes_hl2 = 500 can be much lower, and the network will actually work much better with e.g. n_nodes_hl1 = 10 and n_nodes_hl2 = 10 - this is mainly because you are using large initial values for the weights. You could alternatively scale the weights down, and for more complex data you might want to do that instead; in this case it is simpler to reduce the number of hidden neurons.

  • As we discussed in comments, the start of your train_neural_network function should look like this:

    output = neural_network_model(x)
    prediction = tf.sigmoid(output)
    cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(output, y))
    optimizer = tf.train.AdamOptimizer().minimize(cost)
    

    . . . this is a major difference. By using sigmoid_cross_entropy_with_logits you have committed to using the pre-transform value of the output layer for training. But you still want the predicted values to measure accuracy (or for any other use of the network where you want to read off a predicted value).

  • For a consistent measure of loss, you want the mean loss per example, so you need to divide your sum of per-batch means by the number of batches: 'loss:', epoch_loss/(len(train_x)/batch_size) - as in the sketch below.
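
Pulling those points together, a minimal sketch of just the affected lines might look like this (the node counts are illustrative values from the points above; all names come from the question's code):

x = tf.placeholder('float', [None, 5])
y = tf.placeholder('float', [None, 1])   # explicit shape, so a mis-shaped y fails fast

n_nodes_hl1 = 10   # far fewer hidden units than 500
n_nodes_hl2 = 10

output = neural_network_model(x)    # raw logits, used for the loss
prediction = tf.sigmoid(output)     # sigmoid only for reading off predictions
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(output, y))
optimizer = tf.train.AdamOptimizer().minimize(cost)

# at the end of each epoch, report the mean loss per batch rather than the raw sum
print('Epoch', epoch, 'completed out of', hm_epochs,
      'loss:', epoch_loss / (len(train_x) / batch_size))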

If I make all those corrections and run this for a few more epochs - e.g. 50 - then I get a typical loss of 0.7 and an accuracy measure of 0.5. This occurs reasonably reliably, but does move a little due to changes in the starting weights. The accuracy is not very stable, and possibly suffers from over-fitting, which you are not allowing for at all (you should read up on techniques to help measure and manage over-fitting; it is an important part of training NNs reliably).

The value of 0.5 may seem bad. It is possible to improve upon it by modifying the network architecture or meta-params. For example, I can get down to 0.43 training loss and up to 0.83 test accuracy by swapping tf.nn.relu for tf.tanh in the hidden layers and running for 500 epochs.
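
That tanh variant only touches the two activation lines inside neural_network_model, for example (a sketch - the rest of the corrected code stays the same):

l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
l1 = tf.tanh(l1)

l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
l2 = tf.tanh(l2)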

To understand more about neural networks, what to measure when training and what might be worth changing in your model, you will want to study the subject in more depth.
