Different training results using TensorFlow and Keras

Problem description

I randomly create training data X with shape (10000, 10). The label Y is always equal to the first element of the corresponding X sample.

For example, if x1 = [0.1, 0.2, 0.3, ..., 0.9], then y = 0.1. The dataset is created with the following code:

from numpy.random import RandomState

rdm = RandomState(1)              # fixed seed for reproducibility
data_size = 10000
xdim = 10
X = rdm.rand(data_size, xdim)     # features: uniform random, shape (10000, 10)
Y = [x1[0] for x1 in X]           # label = first feature of each sample

I tried to create a one-layer neural network with a single node to learn this mapping, and I expected the weights to converge to [1, 0, 0, 0, 0, 0, 0, 0, 0, 0] and the bias to 0, so that only the first element of x is extracted.
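
As a quick sanity check (this snippet is not part of the original question), the ordinary least-squares solution can be computed in closed form with np.linalg.lstsq; because Y is exactly linear in X, it recovers precisely those weights and that bias:

import numpy as np

# append a column of ones so the last coefficient plays the role of the bias
Xb = np.hstack([X, np.ones((X.shape[0], 1))])
theta, *_ = np.linalg.lstsq(Xb, np.array(Y), rcond=None)
print(theta[:-1])   # weights, expected to be ~[1, 0, 0, ..., 0]
print(theta[-1])    # bias, expected to be ~0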

Here is my TensorFlow implementation. The training does not converge.

import tensorflow as tf
x = tf.placeholder(tf.float64, shape=(None, xdim))
y = tf.placeholder(tf.float64, shape=(None))

# for simplicity, initialize both weights and biases with zeros
Weights = tf.Variable(tf.zeros([xdim, 1], dtype=tf.float64))
biases = tf.Variable(tf.zeros([1], dtype=tf.float64))
y_predict = tf.matmul(x, Weights) + biases
loss = tf.losses.mean_squared_error(y_predict, y)
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

batch_size=100
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(10001):
        start = i * batch_size % data_size
        end = min(start + batch_size,data_size)
        sess.run(optimizer,feed_dict={x:X[start:end],y:Y[start:end]})
        if i % 1000 == 0:
            ypred,training_loss= sess.run([y_predict,loss],feed_dict={x:X,y:Y})
            print("Epoch %d: loss=%g"%(i,training_loss))
    print('Weights:\n',sess.run(Weights))
    print('biases:\n',sess.run(biases))

The output is:

Epoch 0: loss=0.299163
Epoch 1000: loss=0.0838915
Epoch 2000: loss=0.0829176
Epoch 3000: loss=0.0825273
Epoch 4000: loss=0.08237
Epoch 5000: loss=0.0823084
Epoch 6000: loss=0.0822847
Epoch 7000: loss=0.0822745
Epoch 8000: loss=0.0822701
Epoch 9000: loss=0.082268
Epoch 10000: loss=0.0822669
Weights:
 [[ 0.01159591]
 [ 0.0003244 ]
 [ 0.00319655]
 [ 0.00113588]
 [-0.00079908]
 [-0.00086694]
 [ 0.00020551]
 [-0.00243378]
 [-0.00260724]
 [ 0.00052958]]
biases:
 [ 0.48771921]

Keras

import keras
from keras.models import Sequential
from keras.layers import Dense,Input
import numpy as np

model = Sequential()
model.add(Dense(units=1,input_dim=xdim,kernel_initializer='zeros',bias_initializer='zeros')) 
model.compile(loss='mse', optimizer=keras.optimizers.SGD(lr=0.01))


batch_size=100
for i in range(10001):
    start = i * batch_size % data_size
    end = min(start + batch_size,data_size)
    cost = model.train_on_batch(X[start:end], np.array(Y[start:end]))
    if i % 1000 == 0:
        print("Epoch %d: loss=%g"%(i,cost))
print('Weights:\n',model.get_weights()[0])
print('biases:\n',model.get_weights()[1])

Output:

Using TensorFlow backend.
Epoch 0: loss=0.284947
Epoch 1000: loss=0.00321839
Epoch 2000: loss=0.000247763
Epoch 3000: loss=5.40826e-05
Epoch 4000: loss=1.90453e-05
Epoch 5000: loss=7.40253e-06
Epoch 6000: loss=2.93623e-06
Epoch 7000: loss=1.17069e-06
Epoch 8000: loss=4.67434e-07
Epoch 9000: loss=1.86726e-07
Epoch 10000: loss=7.45764e-08
Weights:
 [[  9.99678493e-01]
 [ -3.00021959e-04]
 [ -2.89586897e-04]
 [ -2.90223019e-04]
 [ -2.83820234e-04]
 [ -2.82248948e-04]
 [ -2.96013983e-04]
 [ -3.13797180e-04]
 [ -3.20409046e-04]
 [ -3.11669020e-04]]
biases:
 [ 0.00153964]

Question

It seems that Keras gets the correct result, but I used the same procedure: the same zero initialization of weights and biases, the same loss function, and the same optimizer with the same learning rate. I cannot understand why this happens. Is there any problem or error in my code?

Recommended answer

You should swap the arguments of tf.losses.mean_squared_error in your TensorFlow implementation, since its signature is mean_squared_error(labels, predictions):

loss = tf.losses.mean_squared_error(y, y_predict) 

In addition, the shapes of y and y_predict are (batch_size,) and (batch_size, 1), respectively. You should squeeze y_predict before passing it to the loss function, in order to avoid unwanted implicit broadcasting:

y_predict = tf.matmul(x, Weights) + biases
y_predict = tf.squeeze(y_predict)                  # shape (batch_size, 1) -> (batch_size,)
loss = tf.losses.mean_squared_error(y, y_predict)
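
To see why the shape mismatch matters, here is a small NumPy sketch (not part of the original answer) of the implicit broadcasting: subtracting a (batch_size,) vector from a (batch_size, 1) column yields a (batch_size, batch_size) matrix, so the loss averages every prediction against every label instead of matching them pairwise:

import numpy as np

batch_size = 4
labels = np.arange(batch_size, dtype=float)   # shape (4,)
preds = labels.reshape(-1, 1)                 # shape (4, 1), like the output of tf.matmul

diff = preds - labels                         # broadcasts to shape (4, 4)
print(diff.shape)                             # (4, 4)
print(np.mean(diff ** 2))                     # 2.5 here, even though the per-sample MSE is 0

An alternative to squeezing y_predict is to declare the placeholder with an explicit shape of (None, 1) and feed Y as a column vector; either way, the labels and predictions end up with matching shapes.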
