Unable to solve the XOR problem with just two hidden neurons in Python
Problem description
I have a small 3-layer neural network with two input neurons, two hidden neurons and one output neuron. I want to stick to the format below, which uses only 2 hidden neurons.
I am trying to show how it can behave as an XOR logic gate. However, with just two hidden neurons I get the following poor output after 1,000,000 iterations:
Input: 0 0 Output: [0.01039096]
Input: 1 0 Output: [0.93708829]
Input: 0 1 Output: [0.93599738]
Input: 1 1 Output: [0.51917667]
If I use three hidden neurons I get a much better output with 100,000 iterations:
Input: 0 0 Output: [0.01831612]
Input: 1 0 Output: [0.98558057]
Input: 0 1 Output: [0.98567602]
Input: 1 1 Output: [0.02007876]
I am getting a decent output with 3 neurons in the hidden layer but not with two neurons in the hidden layer. Why?
As per a comment below, this repo contains code that solves the XOR problem using two hidden neurons.
I can't figure out what I am doing wrong. Any suggestions are appreciated! Attached is my code:
import numpy as np
from matplotlib import pyplot as plt


# Sigmoid function
def sigmoid(x, deriv=False):
    if deriv:
        # x is expected to already be a sigmoid output
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))


alpha = 0.7  # learning rate

# Input dataset
X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])

# Output dataset
y = np.array([[0, 1, 1, 0]]).T

# seed random numbers to make calculation deterministic
np.random.seed(1)

# initialise weights randomly with mean 0
# (3 hidden neurons here; use (2, 2) and (2, 1) for the two-neuron case)
syn0 = 2 * np.random.random((2, 3)) - 1  # 1st layer of weights, synapse 0 connecting L0 to L1
syn1 = 2 * np.random.random((3, 1)) - 1  # 2nd layer of weights, synapse 1 connecting L1 to L2

# Randomize inputs for stochastic gradient descent
data = np.hstack((X, y))  # append input and output dataset
np.random.shuffle(data)  # shuffle rows
x, y = np.array_split(data, 2, 1)  # split along vertical (1) axis: 2 input columns, 1 target column

for epoch in range(100000):
    for i in range(4):
        # forward prop
        layer0 = x[i]  # Input layer
        layer1 = sigmoid(np.dot(layer0, syn0))  # Prediction step for layer 1
        layer2 = sigmoid(np.dot(layer1, syn1))  # Prediction step for layer 2

        layer2_error = y[i] - layer2  # Difference between the target and layer2's guess

        layer2_delta = layer2_error * sigmoid(layer2, deriv=True)  # Error weighted derivative step

        if epoch % 10000 == 0:
            print("Error: ", str(np.mean(np.abs(layer2_error))))
            plt.plot(epoch, layer2_error, 'ro')

        # Uses "confidence weighted error" from l2 to establish an error for l1
        layer1_error = layer2_delta.dot(syn1.T)

        layer1_delta = layer1_error * sigmoid(layer1, deriv=True)  # Error weighted derivative step

        # Since SGD we need the outer product of two 1D arrays. This is how.
        syn1 += alpha * np.dot(layer1[:, None], layer2_delta[None, :])  # Update weights
        syn0 += alpha * np.dot(layer0[:, None], layer1_delta[None, :])

# Training was done above, below we re-run to test the algorithm
layer0 = X  # Input layer
layer1 = sigmoid(np.dot(layer0, syn0))  # Prediction step for layer 1
layer2 = sigmoid(np.dot(layer1, syn1))  # Prediction step for layer 2

plt.show()

# Rows of layer2 follow the row order of X: [0, 0], [0, 1], [1, 0], [1, 1]
print("output after training: \n")
print("Input: 0 0 \t Output: ", layer2[0])
print("Input: 0 1 \t Output: ", layer2[1])
print("Input: 1 0 \t Output: ", layer2[2])
print("Input: 1 1 \t Output: ", layer2[3])
Recommended answer
This happens because you have not given the neurons any bias. You have only used weights to try to fit the XOR model.
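One way to see what the missing bias costs: for the input [0, 0] every weighted sum is 0, so each hidden neuron outputs sigmoid(0) = 0.5 no matter what the weights are. The sketch below checks this for the two-hidden-neuron case. The helper name `hidden_activations` and the random weight ranges are illustrative assumptions, not part of the question's code.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def hidden_activations(inputs, syn0):
    """Hidden-layer output of a bias-free layer, as in the question's code."""
    return sigmoid(np.dot(inputs, syn0))


rng = np.random.default_rng(0)
zero_input = np.array([0, 0])

# Try several random 2x2 weight matrices: the result never changes.
for _ in range(3):
    syn0 = rng.uniform(-5, 5, size=(2, 2))
    print(hidden_activations(zero_input, syn0))  # always [0.5 0.5]
```

For [0, 0] the network output is therefore pinned to sigmoid(0.5 * (v1 + v2)), where v1 and v2 are the two entries of syn1. The output weights have to sum to a large negative number to push that value toward 0, which leaves less freedom for fitting the other three input patterns.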
With 2 neurons in the hidden layer, the network under-fits because it cannot compensate for the missing bias.
With 3 neurons in the hidden layer, the extra neuron counteracts the effect of the missing bias.
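A minimal sketch of how bias terms could be added to a network like the one in the question. It is not the asker's exact code: it uses full-batch gradient descent instead of the per-sample loop, and the names `train_xor`, `b0` and `b1` are assumptions introduced here. Whether it converges for a given seed still depends on the random initialisation, the learning rate and the number of epochs, so no expected output is shown.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


def train_xor(hidden=2, alpha=0.7, epochs=20000, seed=1):
    """Full-batch gradient descent on XOR with bias vectors b0 and b1."""
    rng = np.random.default_rng(seed)
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([[0, 1, 1, 0]], dtype=float).T

    syn0 = rng.uniform(-1, 1, size=(2, hidden))  # input -> hidden weights
    b0 = np.zeros((1, hidden))                   # hidden-layer bias
    syn1 = rng.uniform(-1, 1, size=(hidden, 1))  # hidden -> output weights
    b1 = np.zeros((1, 1))                        # output bias

    for _ in range(epochs):
        # forward pass, now with a bias added to each weighted sum
        layer1 = sigmoid(X @ syn0 + b0)
        layer2 = sigmoid(layer1 @ syn1 + b1)

        # backward pass, same delta rule as in the question
        layer2_delta = (y - layer2) * layer2 * (1 - layer2)
        layer1_delta = (layer2_delta @ syn1.T) * layer1 * (1 - layer1)

        # weight updates; each bias is updated with the summed delta
        syn1 += alpha * layer1.T @ layer2_delta
        b1 += alpha * layer2_delta.sum(axis=0, keepdims=True)
        syn0 += alpha * X.T @ layer1_delta
        b0 += alpha * layer1_delta.sum(axis=0, keepdims=True)

    def predict(inputs):
        return sigmoid(sigmoid(inputs @ syn0 + b0) @ syn1 + b1)

    return predict


predict = train_xor(hidden=2)
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
for row, out in zip(inputs, predict(inputs)):
    print("Input:", row, "Output:", out)
```

If a run ends with one output stuck near 0.5, trying a different `seed` or more `epochs` is a reasonable first step before changing the architecture.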
An example network for an XOR gate has theta (bias) added to the hidden layers, which gives the network an additional parameter to tweak.
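To illustrate that two hidden neurons are enough once a bias is available, here is a sketch of a 2-2-1 network whose parameters are picked by hand rather than trained. The specific values and the names `theta0` and `theta1` are assumptions chosen for illustration: one hidden neuron approximates OR, the other approximates NAND, and the output neuron approximates AND of the two.

```python
import numpy as np


def sigmoid(x):
    return 1 / (1 + np.exp(-x))


X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hand-picked (not trained) parameters, chosen for illustration only.
syn0 = np.array([[20.0, -20.0],
                 [20.0, -20.0]])   # column 0 ~ OR, column 1 ~ NAND
theta0 = np.array([-10.0, 30.0])   # hidden-layer bias
syn1 = np.array([[20.0], [20.0]])  # output ~ AND of the two hidden neurons
theta1 = np.array([-30.0])         # output bias

layer1 = sigmoid(X @ syn0 + theta0)
layer2 = sigmoid(layer1 @ syn1 + theta1)

print(np.round(layer2.ravel(), 3))  # [0. 1. 1. 0.]
```

Setting `theta0` and `theta1` to zero in this sketch makes every row of `layer2` come out close to 1, which shows how much of the work the bias terms are doing here.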