Logistic Regression using TensorFlow 2.0?

Question

I'm trying to build a multi-class logistic regression using TensorFlow 2.0. I've written code that I think is correct, but it's not giving good results: my accuracy is stuck at about 0.10 (roughly 10%, which is chance level for ten classes), and the loss is not decreasing either. I was hoping someone could help me out here.

This is the code I've written so far. Please point out what I'm doing wrong and what I need to change so that my model works. Thank you!

from tensorflow.keras.datasets import fashion_mnist
from sklearn.model_selection import train_test_split
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train/255., x_test/255.

x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.15)
x_train = tf.reshape(x_train, shape=(-1, 784))
x_test  = tf.reshape(x_test, shape=(-1, 784))

weights = tf.Variable(tf.random.normal(shape=(784, 10), dtype=tf.float64))
biases  = tf.Variable(tf.random.normal(shape=(10,), dtype=tf.float64))

def logistic_regression(x):
    lr = tf.add(tf.matmul(x, weights), biases)
    return tf.nn.sigmoid(lr)

def cross_entropy(y_true, y_pred):
    y_true = tf.one_hot(y_true, 10)
    loss = tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_pred)
    return tf.reduce_mean(loss)

def accuracy(y_true, y_pred):
    y_true = tf.cast(y_true, dtype=tf.int32)
    preds = tf.cast(tf.argmax(y_pred, axis=1), dtype=tf.int32)
    preds = tf.equal(y_true, preds)
    return tf.reduce_mean(tf.cast(preds, dtype=tf.float32))

def grad(x, y):
    with tf.GradientTape() as tape:
        y_pred = logistic_regression(x)
        loss_val = cross_entropy(y, y_pred)
    return tape.gradient(loss_val, [weights, biases])

epochs = 1000
learning_rate = 0.01
batch_size = 128

dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.repeat().shuffle(x_train.shape[0]).batch(batch_size)

optimizer = tf.optimizers.SGD(learning_rate)

for epoch, (batch_xs, batch_ys) in enumerate(dataset.take(epochs), 1):
    gradients = grad(batch_xs, batch_ys)
    optimizer.apply_gradients(zip(gradients, [weights, biases]))

    y_pred = logistic_regression(batch_xs)
    loss = cross_entropy(batch_ys, y_pred)
    acc = accuracy(batch_ys, y_pred)
    print("step: %i, loss: %f, accuracy: %f" % (epoch, loss, acc))

    step: 1000, loss: 2.458979, accuracy: 0.101562

Recommended answer

The model is not converging, and the problem seems to be that you apply a sigmoid activation and then pass its output directly to tf.nn.softmax_cross_entropy_with_logits. The documentation for tf.nn.softmax_cross_entropy_with_logits says:

WARNING: This op expects unscaled logits, since it performs a softmax on logits internally for efficiency. Do not call this op with the output of softmax, as it will produce incorrect results.

Hence no softmax, sigmoid, relu, tanh, or any other activation should be applied to the output of the previous layer before it is passed to tf.nn.softmax_cross_entropy_with_logits. For a more in-depth description of when to use a sigmoid or softmax output activation, see here.

Therefore, after replacing return tf.nn.sigmoid(lr) with just return lr in the logistic_regression function, the model converges.
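
As a rough illustration of why the extra sigmoid hurts, here is a minimal pure-Python sketch (not TensorFlow code). The helper names and the example logits are made up for illustration. It assumes ten classes, as in Fashion-MNIST, and compares the softmax cross-entropy of some confident raw logits with the loss obtained after squashing the same values through a sigmoid. Because a sigmoid maps everything into (0, 1), the internal softmax can never become sharply peaked, so the loss cannot drop below log(1 + 9/e) no matter how well the weights fit.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax_cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]

# Hypothetical logits: the model is very confident in class 0.
raw_logits = [10.0] + [0.0] * 9
squashed = [sigmoid(v) for v in raw_logits]

loss_raw = softmax_cross_entropy(raw_logits, 0)
loss_squashed = softmax_cross_entropy(squashed, 0)

# Lower bound after a sigmoid: correct class at 1, all nine others at 0.
loss_floor = math.log(1.0 + 9.0 / math.e)

print("loss on raw logits:    %.4f" % loss_raw)
print("loss after sigmoid:    %.4f" % loss_squashed)
print("floor after a sigmoid: %.4f" % loss_floor)
```

With raw logits the loss for a confident, correct prediction is close to zero, while the squashed version stays above the floor. This only shows the bounded loss. It does not by itself explain why training stalled completely, which may also involve the sigmoid saturating under the large random initial weights.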

Below is a working example of your code with the above fix. I also renamed the variable epochs to n_batches, since your training loop actually goes through 1000 batches, not 1000 epochs (I also bumped it up to 10000, as there were signs that more iterations were needed).

from tensorflow.keras.datasets import fashion_mnist
from sklearn.model_selection import train_test_split
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
x_train, x_test = x_train/255., x_test/255.

x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=0.15)
x_train = tf.reshape(x_train, shape=(-1, 784))
x_test  = tf.reshape(x_test, shape=(-1, 784))

weights = tf.Variable(tf.random.normal(shape=(784, 10), dtype=tf.float64))
biases  = tf.Variable(tf.random.normal(shape=(10,), dtype=tf.float64))

def logistic_regression(x):
    lr = tf.add(tf.matmul(x, weights), biases)
    #return tf.nn.sigmoid(lr)
    return lr


def cross_entropy(y_true, y_pred):
    y_true = tf.one_hot(y_true, 10)
    loss = tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_pred)
    return tf.reduce_mean(loss)

def accuracy(y_true, y_pred):
    y_true = tf.cast(y_true, dtype=tf.int32)
    preds = tf.cast(tf.argmax(y_pred, axis=1), dtype=tf.int32)
    preds = tf.equal(y_true, preds)
    return tf.reduce_mean(tf.cast(preds, dtype=tf.float32))

def grad(x, y):
    with tf.GradientTape() as tape:
        y_pred = logistic_regression(x)
        loss_val = cross_entropy(y, y_pred)
    return tape.gradient(loss_val, [weights, biases])

n_batches = 10000
learning_rate = 0.01
batch_size = 128

dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.repeat().shuffle(x_train.shape[0]).batch(batch_size)

optimizer = tf.optimizers.SGD(learning_rate)

for batch_numb, (batch_xs, batch_ys) in enumerate(dataset.take(n_batches), 1):
    gradients = grad(batch_xs, batch_ys)
    optimizer.apply_gradients(zip(gradients, [weights, biases]))

    y_pred = logistic_regression(batch_xs)
    loss = cross_entropy(batch_ys, y_pred)
    acc = accuracy(batch_ys, y_pred)
    print("Batch number: %i, loss: %f, accuracy: %f" % (batch_numb, loss, acc))

(removed printouts)
>> Batch number: 1000, loss: 2.868473, accuracy: 0.546875
(removed printouts)
>> Batch number: 10000, loss: 1.482554, accuracy: 0.718750
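
The batches-versus-epochs distinction mentioned above can be checked with some quick arithmetic. This is a small sketch, assuming the standard 60,000-image Fashion-MNIST training set and that the 15% validation split leaves 51,000 training images. The function name epochs_for is made up for illustration.

```python
n_train_total = 60000                 # Fashion-MNIST training images
n_val = n_train_total * 15 // 100     # 15% held out by train_test_split
n_train = n_train_total - n_val       # images actually used for training
batch_size = 128

def epochs_for(n_batches):
    """Number of passes over the training data made by n_batches steps."""
    return n_batches * batch_size / n_train

print("training images: %d" % n_train)
print("1000 batches  = %.2f epochs" % epochs_for(1000))
print("10000 batches = %.2f epochs" % epochs_for(10000))
```

So 1000 batches is only about two and a half passes over the data, and 10000 batches is about 25 passes.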
