Why is TensorFlow having worse accuracy than Keras in a direct comparison?

Question

I made a direct comparison between TensorFlow and Keras with the same parameters and the same dataset (MNIST).

The strange thing is that Keras reaches 96% accuracy in 10 epochs, while TensorFlow reaches only about 70% accuracy in 10 epochs. I have run this code many times in the same instance, and this inconsistency always occurs.

Even with 50 epochs, TensorFlow's final accuracy only reaches 90%.

Keras code:

import keras
from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()

# One hot encoding
from keras.utils import np_utils
y_train = np_utils.to_categorical(y_train) 
y_test = np_utils.to_categorical(y_test) 

# Changing the shape of input images and normalizing
x_train = x_train.reshape((60000, 784))
x_test = x_test.reshape((10000, 784))
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation

# Creating the neural network
model = Sequential()
model.add(Dense(30, input_dim=784, kernel_initializer='normal', activation='relu'))
model.add(Dense(30, kernel_initializer='normal', activation='relu'))
model.add(Dense(10, kernel_initializer='normal', activation='softmax'))

# Optimizer
optimizer = keras.optimizers.Adam()

# Loss function
model.compile(loss='categorical_crossentropy', optimizer=optimizer, metrics=['acc'])

# Training
model.fit(x_train, y_train, epochs=10, batch_size=200, validation_data=(x_test, y_test), verbose=1)

# Checking the final accuracy
accuracy_final = model.evaluate(x_test, y_test, verbose=0)
print('Model Accuracy: ', accuracy_final)

TensorFlow code: (x_train, x_test, y_train, y_test are the same as the input for the Keras code above)

import tensorflow as tf
# Epochs parameters
epochs = 10
batch_size = 200

# Neural network parameters
n_input = 784 
n_hidden_1 = 30 
n_hidden_2 = 30 
n_classes = 10 

# Placeholders x, y
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])

# Creating the first layer
w1 = tf.Variable(tf.random_normal([n_input, n_hidden_1]))
b1 = tf.Variable(tf.random_normal([n_hidden_1]))
layer_1 = tf.nn.relu(tf.add(tf.matmul(x,w1),b1)) 

# Creating the second layer 
w2 = tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2]))
b2 = tf.Variable(tf.random_normal([n_hidden_2]))
layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1,w2),b2)) 

# Creating the output layer 
w_out = tf.Variable(tf.random_normal([n_hidden_2, n_classes]))
bias_out = tf.Variable(tf.random_normal([n_classes]))
output = tf.matmul(layer_2, w_out) + bias_out

# Loss function
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits = output, labels = y))
# Optimizer
optimizer = tf.train.AdamOptimizer().minimize(cost)

# Making predictions
predictions = tf.equal(tf.argmax(output, 1), tf.argmax(y, 1))

# Accuracy
accuracy = tf.reduce_mean(tf.cast(predictions, tf.float32))

# Variables that will be used in the training cycle
train_size = x_train.shape[0]
total_batches = train_size / batch_size

# Initializing the variables
init = tf.global_variables_initializer()

# Opening the session
with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(epochs):

        # Loop through all batch iterations
        for i in range(0, train_size, batch_size): 
            batch_x = x_train[i:i + batch_size]
            batch_y = y_train[i:i + batch_size]

            # Fit training
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})

        # Running accuracy (with test data) on each epoch    
        acc_val = sess.run(accuracy, feed_dict={x: x_test, y: y_test})
        # Showing results after each epoch
        print ("Epoch: ", "{}".format((epoch + 1)))
        print ("Accuracy_val = ", "{:.3f}".format(acc_val))

    print ("Training Completed!")

    # Checking the final accuracy
    checking = tf.equal(tf.argmax(output, 1), tf.argmax(y, 1))
    accuracy_final = tf.reduce_mean(tf.cast(checking, tf.float32))  
    print ("Model Accuracy:", accuracy_final.eval({x: x_test, y: y_test}))

I'm running everything in the same instance. Can anyone explain this inconsistency?

Recommended answer

I think it's the initialization that's the culprit. For example, one real difference is that you initialize bias in TF with random_normal which isn't the best practice, and in fact Keras defaults to initializing the bias to zero, which is the best practice. You don't override this, since you only set kernel_initializer, but not bias_initializer in your Keras code.

Furthermore, things are worse for the weight initializers. You are using RandomNormal for Keras, defined like so:

keras.initializers.RandomNormal(mean=0.0, stddev=0.05, seed=None)

But in TF you use tf.random_normal (exposed as tf.random.normal in newer releases), whose default standard deviation is 1.0:

tf.random.normal(shape, mean=0.0, stddev=1.0, dtype=tf.dtypes.float32, seed=None, name=None)

I can tell you that using standard deviation of 0.05 is reasonable for initialization, but using 1.0 is not.
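
To make the scale argument concrete, here is a minimal sketch in plain Python (no TensorFlow or Keras) that pushes synthetic inputs through an untrained 784-30-30-10 network and reports the initial cross-entropy loss for both settings. The helper names dense and initial_loss are hypothetical, and the inputs are random stand-ins for normalized MNIST images rather than real data, so only the order of magnitude is meaningful. A network that outputs near-uniform probabilities over 10 classes has a loss close to ln(10), about 2.30.

```python
import math
import random


def dense(inputs, weights, biases, relu=True):
    """One fully connected layer: inputs @ weights + biases, optional ReLU."""
    outputs = []
    for j in range(len(biases)):
        total = biases[j] + sum(x * row[j] for x, row in zip(inputs, weights))
        outputs.append(max(total, 0.0) if relu else total)
    return outputs


def initial_loss(weight_std, bias_std, rng, n_samples=20):
    """Mean cross-entropy of an untrained 784-30-30-10 net on synthetic inputs."""
    sizes = [784, 30, 30, 10]
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = [[rng.gauss(0.0, weight_std) for _ in range(n_out)] for _ in range(n_in)]
        # bias_std == 0 reproduces the zero-bias default described in the answer
        b = [rng.gauss(0.0, bias_std) if bias_std > 0 else 0.0 for _ in range(n_out)]
        layers.append((w, b))

    total = 0.0
    for _ in range(n_samples):
        # Assumption: a sparse vector in [0, 1] stands in for a normalized image
        h = [rng.random() if rng.random() < 0.2 else 0.0 for _ in range(784)]
        label = rng.randrange(10)
        for k, (w, b) in enumerate(layers):
            h = dense(h, w, b, relu=(k < len(layers) - 1))
        # Numerically stable log-sum-exp, then cross-entropy for the true label
        m = max(h)
        log_sum = m + math.log(sum(math.exp(v - m) for v in h))
        total += log_sum - h[label]
    return total / n_samples


# Keras-like setup: weights with stddev 0.05, zero biases
narrow_loss = initial_loss(0.05, 0.0, random.Random(0))
# Setup from the question's TF code: weights and biases with stddev 1.0
wide_loss = initial_loss(1.0, 1.0, random.Random(0))

print("initial loss, stddev=0.05:", narrow_loss)
print("initial loss, stddev=1.0: ", wide_loss)
```

With a standard deviation of 1.0 the logits grow with every layer, so the softmax starts out saturated and the optimizer has to spend many steps undoing confident wrong predictions before it can make progress.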

I suspect that if you changed these parameters, things would look better. But if they don't, I'd suggest dumping the TensorFlow graph for both models and just checking by hand to see the differences. The graphs are small enough in this case to double-check.
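
As a sketch of what changing these parameters could look like on the TensorFlow side, the variables can be created with a standard deviation of 0.05 for the weights and zeros for the biases, mirroring the Keras setup. The helper make_layer is a hypothetical name introduced here for illustration, and the snippet assumes a TensorFlow version that exposes tf.random.normal (older 1.x releases spell it tf.random_normal).

```python
import tensorflow as tf

n_input, n_hidden_1, n_hidden_2, n_classes = 784, 30, 30, 10


def make_layer(n_in, n_out, stddev=0.05):
    """Weights drawn from N(0, stddev^2) and zero biases, as in the Keras model."""
    weights = tf.Variable(tf.random.normal([n_in, n_out], stddev=stddev))
    biases = tf.Variable(tf.zeros([n_out]))
    return weights, biases


# Drop-in replacements for the variables defined in the question's TF code
w1, b1 = make_layer(n_input, n_hidden_1)
w2, b2 = make_layer(n_hidden_1, n_hidden_2)
w_out, bias_out = make_layer(n_hidden_2, n_classes)
```

The rest of the graph (the matmul, relu, loss, and optimizer lines) can stay as written in the question, since only the initial values of the variables change.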

To some extent this highlights the difference in philosophy between Keras and TF. Keras tries hard to set good defaults for NN training that correspond to what is known to work. But TensorFlow is completely agnostic - you have to know those practices and explicitly code them in. The standard deviation thing is a stellar example: of course it should be 1 by default in a mathematical function, but 0.05 is a good value if you know it will be used to initialize an NN layer.

Answer originally provided by Dmitriy Genzel on Quora.
