Why is a simple 2-layer Neural Network unable to learn 0,0 sequence?


Problem Description

While going through an example of a tiny 2-layer neural network, I noticed a result that I cannot explain.

Imagine we have the following dataset with the corresponding labels:

[0,1] -> [0]
[0,1] -> [0]
[1,0] -> [1]
[1,0] -> [1]

Let's create a tiny 2-layer NN which will learn to predict the outcome of a two-number sequence, where each number can be 0 or 1. We shall train this NN on the dataset mentioned above.

    import numpy as np

    # compute sigmoid nonlinearity
    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    # convert output of sigmoid function to its derivative
    def sigmoid_to_deriv(output):
        return output * (1 - output)

    def predict(inp, weights):
        print(inp, sigmoid(np.dot(inp, weights)))

    # input dataset
    X = np.array([[0, 1],
                  [0, 1],
                  [1, 0],
                  [1, 0]])
    # output dataset
    Y = np.array([[0, 0, 1, 1]]).T

    np.random.seed(1)

    # init weights randomly with mean 0
    weights0 = 2 * np.random.random((2, 1)) - 1

    for i in range(10000):
        # forward propagation
        layer0 = X
        layer1 = sigmoid(np.dot(layer0, weights0))
        # compute the error
        layer1_error = layer1 - Y

        # gradient descent
        # calculate the slope at the current position
        layer1_delta = layer1_error * sigmoid_to_deriv(layer1)
        weights0_deriv = np.dot(layer0.T, layer1_delta)
        # move the weights against the gradient (w = w - slope)
        weights0 -= weights0_deriv

    print('INPUT   PREDICTION')
    predict([0, 1], weights0)
    predict([1, 0], weights0)
    # test prediction of the unknown data
    predict([1, 1], weights0)
    predict([0, 0], weights0)

After we've trained this NN, we test it:

INPUT   PREDICTION
[0, 1] [ 0.00881315]
[1, 0] [ 0.99990851]
[1, 1] [ 0.5]
[0, 0] [ 0.5]

OK, 0,1 and 1,0 are what we would expect. The predictions for 0,0 and 1,1 are also explainable: our NN just didn't have the training data for these cases, so let's add them to our training dataset:

[0,1] -> [0]
[0,1] -> [0]
[1,0] -> [1]
[1,0] -> [1]
[0,0] -> [0]
[1,1] -> [1]
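
In code, the expanded arrays would presumably look like this (a sketch matching the pairs listed above):

    # expanded input dataset
    X = np.array([[0, 1],
                  [0, 1],
                  [1, 0],
                  [1, 0],
                  [0, 0],
                  [1, 1]])
    # expanded output dataset
    Y = np.array([[0, 0, 1, 1, 0, 1]]).T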

Retrain the network and test it again!

INPUT   PREDICTION
[0, 1] [ 0.00881315]
[1, 0] [ 0.99990851]
[1, 1] [ 0.9898148]
[0, 0] [ 0.5]

Wait, why is [0,0] still 0.5?

This means that the NN is still uncertain about 0,0, just as it was uncertain about 1,1 until we trained on it.

Answer

The classification is right as well. You need to understand that the net was able to separate the test set.

Now you need to use a step function to classify the data as 0 or 1.

In your case, 0.5 seems to be a good threshold.
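
Since the net outputs a continuous sigmoid score rather than a hard label, a thresholding step like the sketch below does the final classification. The classify helper is hypothetical, not part of the original code:

    # hypothetical helper (not in the original post): threshold the
    # sigmoid score at 0.5 to get a hard 0/1 class label
    def classify(inp, weights, threshold=0.5):
        score = sigmoid(np.dot(inp, weights)).item()
        return 1 if score >= threshold else 0

Note that [0,0] scores exactly 0.5, sitting right on the threshold, which is why the bias discussed next is still needed.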

You need to add a bias to the code. Without a bias, the input [0,0] always produces a pre-activation of exactly 0 (the dot product of [0,0] with any weights is 0), so the output is sigmoid(0) = 0.5 no matter how long you train.

    # input dataset, with a constant bias input of 1 prepended to each sample
    X = np.array([[1, 0, 1],
                  [1, 0, 1],
                  [1, 1, 0],
                  [1, 1, 0]])

    # init weights randomly with mean 0 (three weights: bias + two inputs)
    weights0 = 2 * np.random.random((3, 1)) - 1
      

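Putting the pieces together, here is a minimal self-contained sketch of the retrained network with both the expanded dataset and a constant bias input of 1 (my own assembly of the fragments above, not code from the original answer):

    import numpy as np

    def sigmoid(x):
        return 1 / (1 + np.exp(-x))

    def sigmoid_to_deriv(output):
        return output * (1 - output)

    def predict(inp, weights):
        # prepend the constant bias input of 1, matching the training data
        inp = [1] + list(inp)
        print(inp[1:], sigmoid(np.dot(inp, weights)))

    # full dataset: bias column of ones, then the two inputs
    X = np.array([[1, 0, 1],
                  [1, 0, 1],
                  [1, 1, 0],
                  [1, 1, 0],
                  [1, 0, 0],
                  [1, 1, 1]])
    Y = np.array([[0, 0, 1, 1, 0, 1]]).T

    np.random.seed(1)
    weights0 = 2 * np.random.random((3, 1)) - 1

    for i in range(10000):
        layer1 = sigmoid(np.dot(X, weights0))
        layer1_delta = (layer1 - Y) * sigmoid_to_deriv(layer1)
        weights0 -= np.dot(X.T, layer1_delta)

    print('INPUT   PREDICTION')
    for sample in ([0, 1], [1, 0], [1, 1], [0, 0]):
        predict(sample, weights0)

With the bias weight free to move, the pre-activation for [0,0] is no longer pinned to 0, so the network can push its prediction for [0,0] toward 0 instead of being stuck at sigmoid(0) = 0.5.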