Problems implementing an XOR gate with Neural Nets in Tensorflow


Question

I want to build a trivial neural network that just implements the XOR gate. I am using the TensorFlow library in Python. For an XOR gate, the only data I train with is the complete truth table; that should be enough, right? Over-optimization is what I expect to happen very quickly. The problem with the code is that the weights and biases do not update. Somehow it still gives me 100% accuracy with zeros for the biases and weights.

import numpy as np
import tensorflow as tf
from itertools import product

x = tf.placeholder("float", [None, 2])
W = tf.Variable(tf.zeros([2,2]))
b = tf.Variable(tf.zeros([2]))

y = tf.nn.softmax(tf.matmul(x,W) + b)

y_ = tf.placeholder("float", [None,1])


print "Done init"

cross_entropy = -tf.reduce_sum(y_*tf.log(y))
train_step = tf.train.GradientDescentOptimizer(0.75).minimize(cross_entropy)

print "Done loading vars"

init = tf.initialize_all_variables()
print "Done: Initializing variables"

sess = tf.Session()
sess.run(init)
print "Done: Session started"

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1], [0], [0], [0]])


acc=0.0
while acc<0.85:
  for i in range(500):
      sess.run(train_step, feed_dict={x: xTrain, y_: yTrain})


  print b.eval(sess)
  print W.eval(sess)


  print "Done training"


  correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

  accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

  print "Result:"
  acc= sess.run(accuracy, feed_dict={x: xTrain, y_: yTrain})
  print acc

B0 = b.eval(sess)[0]
B1 = b.eval(sess)[1]
W00 = W.eval(sess)[0][0]
W01 = W.eval(sess)[0][1]
W10 = W.eval(sess)[1][0]
W11 = W.eval(sess)[1][1]

for A,B in product([0,1],[0,1]):
  top = W00*A + W01*A + B0
  bottom = W10*B + W11*B + B1
  print "A:",A," B:",B
  # print "Top",top," Bottom: ", bottom
  print "Sum:",top+bottom

I am following the tutorial from http://tensorflow.org/tutorials/mnist/beginners/index.md#softmax_regressions, and in the final for-loop I am printing the results from the matrix (as described in the link).

Can anybody point out my error and what I should do to fix it?

Recommended answer

There are a few issues with your program.

The first issue is that the function you're learning isn't XOR - it's NOR. The lines:

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1], [0], [0], [0]])

...should be:

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[0], [1], [1], [0]])
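
As a quick sanity check on the labels, the two truth tables can be generated rather than typed by hand. This is only an illustrative sketch in plain Python 3 (the names `inputs`, `xor_labels` and `nor_labels` are made up for this example and are not part of the original program):

```python
from itertools import product

# All four input pairs, in the same order as xTrain.
inputs = [list(pair) for pair in product([0, 1], repeat=2)]

# XOR is 1 when exactly one input is 1; NOR is 1 only when both inputs are 0.
xor_labels = [[a ^ b] for a, b in inputs]
nor_labels = [[int(not (a or b))] for a, b in inputs]

print("inputs:", inputs)
print("XOR:   ", xor_labels)
print("NOR:   ", nor_labels)
```

The NOR column matches the labels in the question, and the XOR column matches the corrected yTrain above.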

The next big issue is that the network you've designed isn't capable of learning XOR. You'll need to use a non-linear function (such as tf.nn.relu()) and define at least one more layer to learn the XOR function. For example:

x = tf.placeholder("float", [None, 2])
W_hidden = tf.Variable(...)
b_hidden = tf.Variable(...)
hidden = tf.nn.relu(tf.matmul(x, W_hidden) + b_hidden)

W_logits = tf.Variable(...)
b_logits = tf.Variable(...)
logits = tf.matmul(hidden, W_logits) + b_logits
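
To see why the hidden layer matters: any single affine score f(a, b) = w1*a + w2*b + c satisfies f(0,0) + f(1,1) = f(0,1) + f(1,0), so it cannot be high on both mixed inputs and low on both matching ones. With one ReLU hidden layer the function becomes representable. The sketch below uses hand-picked weights (an assumption chosen for illustration, not what training would necessarily find) in plain Python 3:

```python
def relu(v):
    return max(0.0, v)

def xor_net(a, b):
    # Hidden layer: two ReLU units with hand-picked weights and biases.
    h1 = relu(a + b)          # counts how many inputs are on
    h2 = relu(a + b - 1.0)    # positive only when both inputs are on
    # Linear output layer combining the two hidden units.
    return h1 - 2.0 * h2

outputs = [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)
```

Two hidden units are enough in principle; a trained network may still benefit from more, since gradient descent does not always find such a compact solution.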

A further issue is that initializing the weights to zero will prevent your network from training. Typically, you should initialize your weights randomly, and your biases to zero. Here's one popular way to do it:

HIDDEN_NODES = 2

W_hidden = tf.Variable(tf.truncated_normal([2, HIDDEN_NODES], stddev=1./math.sqrt(2)))
b_hidden = tf.Variable(tf.zeros([HIDDEN_NODES]))

W_logits = tf.Variable(tf.truncated_normal([HIDDEN_NODES, 2], stddev=1./math.sqrt(HIDDEN_NODES)))
b_logits = tf.Variable(tf.zeros([2]))
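
The effect of zero initialization can be checked by hand. The following plain Python 3 sketch computes the weight gradients of a small ReLU network with a softmax cross-entropy loss; the function `weight_gradients` and the sample non-zero weights are hypothetical and written only for this illustration, not taken from TensorFlow:

```python
import math

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[1, 0], [0, 1], [0, 1], [1, 0]]  # one-hot XOR labels
H = 2  # number of hidden units

def weight_gradients(W_h, b_h, W_o, b_o):
    """Return (dW_h, dW_o) for the mean softmax cross-entropy over X."""
    dW_h = [[0.0] * H for _ in range(2)]
    dW_o = [[0.0] * 2 for _ in range(H)]
    for x, y in zip(X, Y):
        # Forward pass.
        pre = [sum(x[i] * W_h[i][j] for i in range(2)) + b_h[j] for j in range(H)]
        hid = [max(0.0, p) for p in pre]
        logits = [sum(hid[j] * W_o[j][k] for j in range(H)) + b_o[k] for k in range(2)]
        exps = [math.exp(l - max(logits)) for l in logits]
        probs = [e / sum(exps) for e in exps]
        # Backward pass: d(loss)/d(logits) = (probs - y) / batch size.
        d_logits = [(probs[k] - y[k]) / len(X) for k in range(2)]
        for j in range(H):
            for k in range(2):
                dW_o[j][k] += hid[j] * d_logits[k]
            d_hid = sum(W_o[j][k] * d_logits[k] for k in range(2))
            d_pre = d_hid if pre[j] > 0 else 0.0  # ReLU passes gradient only where active
            for i in range(2):
                dW_h[i][j] += x[i] * d_pre
    return dW_h, dW_o

# All-zero weights: the hidden activations and the back-propagated signal both vanish.
zero_h = [[0.0] * H for _ in range(2)]
zero_o = [[0.0] * 2 for _ in range(H)]
zero_dW_h, zero_dW_o = weight_gradients(zero_h, [0.0] * H, zero_o, [0.0] * 2)
print(zero_dW_h, zero_dW_o)

# Arbitrary non-zero weights: the gradients are no longer all zero.
rand_dW_h, rand_dW_o = weight_gradients(
    [[0.5, -0.3], [0.2, 0.7]], [0.0] * H, [[0.4, -0.6], [-0.1, 0.3]], [0.0] * 2)
print(rand_dW_h, rand_dW_o)
```

In this sketch, with every weight at zero the hidden layer outputs zero (so dW_o is zero) and nothing flows back through the zero output weights (so dW_h is zero); only the output bias would receive a gradient, which is not enough to learn XOR.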

Putting it all together, and using TensorFlow routines for cross-entropy (with a one-hot encoding of yTrain for convenience), here's a program that learns XOR:

import math
import tensorflow as tf
import numpy as np

HIDDEN_NODES = 10

x = tf.placeholder(tf.float32, [None, 2])
W_hidden = tf.Variable(tf.truncated_normal([2, HIDDEN_NODES], stddev=1./math.sqrt(2)))
b_hidden = tf.Variable(tf.zeros([HIDDEN_NODES]))
hidden = tf.nn.relu(tf.matmul(x, W_hidden) + b_hidden)

W_logits = tf.Variable(tf.truncated_normal([HIDDEN_NODES, 2], stddev=1./math.sqrt(HIDDEN_NODES)))
b_logits = tf.Variable(tf.zeros([2]))
logits = tf.matmul(hidden, W_logits) + b_logits

y = tf.nn.softmax(logits)

y_input = tf.placeholder(tf.float32, [None, 2])

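# Keyword arguments avoid mixing up the argument order; later TensorFlow releases require them.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_input)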
loss = tf.reduce_mean(cross_entropy)

train_op = tf.train.GradientDescentOptimizer(0.2).minimize(loss)

init_op = tf.initialize_all_variables()

sess = tf.Session()
sess.run(init_op)

xTrain = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
yTrain = np.array([[1, 0], [0, 1], [0, 1], [1, 0]])

for i in xrange(500):
  _, loss_val = sess.run([train_op, loss], feed_dict={x: xTrain, y_input: yTrain})

  if i % 10 == 0:
    print "Step:", i, "Current loss:", loss_val
    for x_input in [[0, 0], [0, 1], [1, 0], [1, 1]]:
      print x_input, sess.run(y, feed_dict={x: [x_input]})
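
For readers who want to see what the one-hot labels and the loss look like numerically, here is a minimal plain Python 3 sketch. The helpers `one_hot` and `softmax_cross_entropy` are hypothetical names written for this illustration; they follow the standard textbook definition for a single example and are not TensorFlow's implementation:

```python
import math

def one_hot(labels, num_classes):
    # Turn class indices such as [0, 1, 1, 0] into rows like [1, 0] and [0, 1].
    return [[1 if k == label else 0 for k in range(num_classes)] for label in labels]

def softmax_cross_entropy(logits, target):
    # Use the log-sum-exp trick (subtract the max logit) for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return -sum(t * (l - log_sum) for l, t in zip(logits, target))

y_train = one_hot([0, 1, 1, 0], 2)
print(y_train)

# With equal logits the model is maximally unsure between two classes: loss = ln(2).
print(softmax_cross_entropy([0.0, 0.0], y_train[0]))
```

A two-column label array also makes an argmax-based accuracy check meaningful, because taking the argmax along a single-column label array always yields index 0.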

Note that this is probably not the most efficient neural network for computing XOR, so suggestions for tweaking the parameters are welcome!
