I modified a few layers of an example neural network just to see if I could. What's wrong with it?


Problem description

A simple neural network I found had the layers w1, Relu, and w2. I tried to add a new weight layer in the middle and a second Relu after it, so the layers are now w1, Relu, w_mid, Relu, and w2.
It is much, much slower than the original three-layer network, if it works at all. I'm not sure whether every part is getting a forward pass and whether backprop is working across every part it is supposed to.
The neural network is from this link; it is the third block of code down the page.

This is the code I changed.
Below it is the original.

    import torch
    dtype = torch.float
    device = torch.device("cpu")
    #device = torch.device("cuda:0") # Uncomment this to run on GPU

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 250, 250, 10

    # Create random input and output data
    x = torch.randn(N, D_in, device=device, dtype=dtype)
    y = torch.randn(N, D_out, device=device, dtype=dtype)

    # Randomly initialize weights
    w1 = torch.randn(D_in, H, device=device, dtype=dtype)
    w_mid = torch.randn(H, H, device=device, dtype=dtype)
    w2 = torch.randn(H, D_out, device=device, dtype=dtype)

    learning_rate = 1e-5
    for t in range(5000):
        # Forward pass: compute predicted y
        h = x.mm(w1)
        h_relu = h.clamp(min=0)
        k = h_relu.mm(w_mid)
        k_relu = k.clamp(min=0)
        y_pred = k_relu.mm(w2)


        # Compute and print loss
        loss = (y_pred - y).pow(2).sum().item()
        if t % 1000 == 0:
            print(t, loss)

        # Backprop to compute gradients of w1, mid, and w2 with respect to loss
        grad_y_pred = (y_pred - y) * 2
        grad_w2 = k_relu.t().mm(grad_y_pred)
        grad_k_relu = grad_y_pred.mm(w2.t())
        grad_k = grad_k_relu.clone()
        grad_k[k < 0] = 0
        grad_mid = h_relu.t().mm(grad_k)
        grad_h_relu = grad_k.mm(w1.t())
        grad_h = grad_h_relu.clone()
        grad_h[h < 0] = 0
        grad_w1 = x.t().mm(grad_h)

        # Update weights
        w1 -= learning_rate * grad_w1
        w_mid -= learning_rate * grad_mid
        w2 -= learning_rate * grad_w2  

The loss is:

    0 1904074240.0
    1000 639.4848022460938
    2000 639.4848022460938
    3000 639.4848022460938
    4000 639.4848022460938

This is the original code from the PyTorch website.

    import torch


    dtype = torch.float
    #device = torch.device("cpu")
    device = torch.device("cuda:0") # Uncomment this to run on GPU

    # N is batch size; D_in is input dimension;
    # H is hidden dimension; D_out is output dimension.
    N, D_in, H, D_out = 64, 1000, 100, 10

    # Create random input and output data
    x = torch.randn(N, D_in, device=device, dtype=dtype)
    y = torch.randn(N, D_out, device=device, dtype=dtype)

    # Randomly initialize weights
    w1 = torch.randn(D_in, H, device=device, dtype=dtype)
    w2 = torch.randn(H, D_out, device=device, dtype=dtype)

    learning_rate = 1e-6
    for t in range(500):
        # Forward pass: compute predicted y
        h = x.mm(w1)
        h_relu = h.clamp(min=0)
        y_pred = h_relu.mm(w2)

        # Compute and print loss
        loss = (y_pred - y).pow(2).sum().item()
        if t % 100 == 99:
            print(t, loss)

        # Backprop to compute gradients of w1 and w2 with respect to loss
        grad_y_pred = 2.0 * (y_pred - y)
        grad_w2 = h_relu.t().mm(grad_y_pred)
        grad_h_relu = grad_y_pred.mm(w2.t())
        grad_h = grad_h_relu.clone()
        grad_h[h < 0] = 0
        grad_w1 = x.t().mm(grad_h)

        # Update weights using gradient descent
        w1 -= learning_rate * grad_w1
        w2 -= learning_rate * grad_w2

Recommended answer

The gradient of h_relu is not computed correctly.

    grad_h_relu = grad_k.mm(w1.t())

It should be w_mid instead of w1:

    grad_h_relu = grad_k.mm(w_mid.t())

Other than that, the calculations are correct, but you should lower the learning rate: the gradients are very big at the beginning, which makes the weights very large and leads to overflowing values (infinity), which in turn produce NaN losses and gradients. This is known as exploding gradients.
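
One quick way to see this (a diagnostic addition, not part of either code block above) is to print the gradient norms alongside the loss after the gradients are computed inside the modified training loop:

    # Diagnostic only: watch the gradient magnitudes grow when the learning rate is too high.
    # Place after grad_w1, grad_mid and grad_w2 are computed inside the loop above.
    if t % 1000 == 0:
        print(t, loss,
              grad_w1.norm().item(),
              grad_mid.norm().item(),
              grad_w2.norm().item())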

In your example, a learning rate of 1e-8 seems to work.
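
For reference, this is what the training loop of the modified code looks like with both suggestions applied: the grad_h_relu line uses w_mid instead of w1, and the learning rate is lowered from 1e-5 to 1e-8. Everything else is copied unchanged from the questioner's code above.

    learning_rate = 1e-8  # lowered from 1e-5 to avoid exploding gradients
    for t in range(5000):
        # Forward pass: compute predicted y
        h = x.mm(w1)
        h_relu = h.clamp(min=0)
        k = h_relu.mm(w_mid)
        k_relu = k.clamp(min=0)
        y_pred = k_relu.mm(w2)

        # Compute and print loss
        loss = (y_pred - y).pow(2).sum().item()
        if t % 1000 == 0:
            print(t, loss)

        # Backprop to compute gradients of w1, w_mid, and w2 with respect to loss
        grad_y_pred = (y_pred - y) * 2
        grad_w2 = k_relu.t().mm(grad_y_pred)
        grad_k_relu = grad_y_pred.mm(w2.t())
        grad_k = grad_k_relu.clone()
        grad_k[k < 0] = 0
        grad_mid = h_relu.t().mm(grad_k)
        grad_h_relu = grad_k.mm(w_mid.t())  # was w1.t(); this was the bug
        grad_h = grad_h_relu.clone()
        grad_h[h < 0] = 0
        grad_w1 = x.t().mm(grad_h)

        # Update weights
        w1 -= learning_rate * grad_w1
        w_mid -= learning_rate * grad_mid
        w2 -= learning_rate * grad_w2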
