PyTorch loss value does not change


Problem description

I wrote a module based on this article: http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/

The idea is to pass the input through multiple parallel streams, concatenate their outputs, and connect them to a fully connected layer. I divided my source code into 3 custom modules: TextClassifyCnnNet >> FlatCnnLayer >> FilterLayer

FilterLayer:

import math

import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.autograd import Variable


class FilterLayer(nn.Module):
    def __init__(self, filter_size, embedding_size, sequence_length, out_channels=128):
        super(FilterLayer, self).__init__()

        self.model = nn.Sequential(
            nn.Conv2d(1, out_channels, (filter_size, embedding_size)),
            nn.ReLU(inplace=True),
            nn.MaxPool2d((sequence_length - filter_size + 1, 1), stride=1)
        )

        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))

    def forward(self, x):
        return self.model(x)
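
For reference, a quick shape check of one stream (a minimal sketch with made-up sizes, assuming a recent PyTorch; under the old Variable API the input would be wrapped in Variable first):

layer = FilterLayer(filter_size=3, embedding_size=128, sequence_length=56)
x = torch.randn(4, 1, 56, 128)   # (batch, in_channels=1, sequence_length, embedding_size)
out = layer(x)
print(out.size())                # (4, 128, 1, 1): the max-pool collapses the conv output to a single value per channel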

FlatCnnLayer:

class FlatCnnLayer(nn.Module):
    def __init__(self, embedding_size, sequence_length, filter_sizes=[3, 4, 5], out_channels=128):
        super(FlatCnnLayer, self).__init__()

        self.filter_layers = nn.ModuleList(
            [FilterLayer(filter_size, embedding_size, sequence_length, out_channels=out_channels) for
             filter_size in filter_sizes])

    def forward(self, x):
        pools = []
        for filter_layer in self.filter_layers:
            out_filter = filter_layer(x)
            # flatten each stream's (batch_size, out_channels, 1, 1) output to (batch_size, 1, 1, out_channels)
            # so the streams can be concatenated along the last dimension
            pools.append(out_filter.view(out_filter.size()[0], 1, 1, -1))
        x = torch.cat(pools, dim=3)

        x = x.view(x.size()[0], -1)
        x = F.dropout(x, p=dropout_prob, training=True)

        return x
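
Concatenating the streams gives out_channels * len(filter_sizes) features per sample, which is the size the Linear layer below must match. A quick check with the same made-up sizes (note that FlatCnnLayer.forward reads a module-level dropout_prob, so a value such as dropout_prob = 0.5 must be defined for it to run):

dropout_prob = 0.5               # assumed value; FlatCnnLayer.forward expects this global
flat = FlatCnnLayer(embedding_size=128, sequence_length=56)
x = torch.randn(4, 1, 56, 128)
print(flat(x).size())            # (4, 384): 128 out_channels * 3 filter sizes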

TextClassifyCnnNet (main module):

class TextClassifyCnnNet(nn.Module):
    def __init__(self, embedding_size, sequence_length, num_classes, filter_sizes=[3, 4, 5], out_channels=128):
        super(TextClassifyCnnNet, self).__init__()

        self.flat_layer = FlatCnnLayer(embedding_size, sequence_length, filter_sizes=filter_sizes,
                                       out_channels=out_channels)

        self.model = nn.Sequential(
            self.flat_layer,
            nn.Linear(out_channels * len(filter_sizes), num_classes)
        )

    def forward(self, x):
        x = self.model(x)

        return x
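
The full model then produces a (batch, num_classes) tensor of raw scores, which is exactly what F.cross_entropy below expects as its first argument. A sketch with the same example sizes and the same dropout_prob assumption:

net = TextClassifyCnnNet(embedding_size=128, sequence_length=56, num_classes=2)
print(net(torch.randn(4, 1, 56, 128)).size())   # (4, 2)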


def fit(net, data, save_path):
    if torch.cuda.is_available():
        net = net.cuda()

    for param in list(net.parameters()):
        print(type(param.data), param.size())

    optimizer = optim.Adam(net.parameters(), lr=0.01, weight_decay=0.1)

    X_train, X_test = data['X_train'], data['X_test']
    Y_train, Y_test = data['Y_train'], data['Y_test']

    X_valid, Y_valid = data['X_valid'], data['Y_valid']

    n_batch = len(X_train) // batch_size

    for epoch in range(1, n_epochs + 1):  # loop over the dataset multiple times
        net.train()
        start = 0
        end = batch_size

        for batch_idx in range(1, n_batch + 1):
            # get the inputs
            x, y = X_train[start:end], Y_train[start:end]
            start = end
            end = start + batch_size

            # zero the parameter gradients
            optimizer.zero_grad()

            # forward + backward + optimize
            predicts = _get_predict(net, x)
            loss = _get_loss(predicts, y)
            loss.backward()
            optimizer.step()

            if batch_idx % display_step == 0:
                print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                    epoch, batch_idx * len(x), len(X_train), 100. * batch_idx / (n_batch + 1), loss.data[0]))

        # print statistics
        if epoch % display_step == 0 or epoch == 1:
            net.eval()
            valid_predicts = _get_predict(net, X_valid)
            valid_loss = _get_loss(valid_predicts, Y_valid)
            valid_accuracy = _get_accuracy(valid_predicts, Y_valid)
            print('\r[%d] loss: %.3f - accuracy: %.2f' % (epoch, valid_loss.data[0], valid_accuracy * 100))

    print('\rFinished Training\n')

    net.eval()

    test_predicts = _get_predict(net, X_test)
    test_loss = _get_loss(test_predicts, Y_test).data[0]
    test_accuracy = _get_accuracy(test_predicts, Y_test)
    print('Test loss: %.3f - Test accuracy: %.2f' % (test_loss, test_accuracy * 100))

    torch.save(net.flat_layer.state_dict(), save_path)


def _get_accuracy(predicts, labels):
    # take the arg-max class per sample and compare it element-wise against the numpy label array
    predicts = torch.max(predicts, 1)[1].data.cpu().numpy()
    return np.mean(predicts == labels)


def _get_predict(net, x):
    # wrap them in Variable
    inputs = torch.from_numpy(x).float()
    # convert to cuda tensors if cuda flag is true
    if torch.cuda.is_available():
        inputs = inputs.cuda()
    inputs = Variable(inputs)
    return net(inputs)


def _get_loss(predicts, labels):
    labels = torch.from_numpy(labels).long()
    # convert to cuda tensors if cuda flag is true
    if torch.cuda.is_available():
        labels = labels.cuda()
    labels = Variable(labels)
    return F.cross_entropy(predicts, labels)

It seems that the parameters are only updated slightly each epoch, and the accuracy stays the same throughout the whole run. With the same implementation and the same parameters in TensorFlow, it runs correctly.

I'm new to PyTorch, so maybe something in my code is wrong; please help me find it. Thank you!

P.S.: I tried using F.nll_loss + F.log_softmax instead of F.cross_entropy. In theory they should return the same value, but in practice a different number is printed (and it is still a wrong loss value).
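
For reference, F.cross_entropy is F.nll_loss composed with F.log_softmax, so the two formulations should agree up to floating-point noise. A minimal check (assuming a recent PyTorch):

logits = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, 0])
a = F.cross_entropy(logits, target)
b = F.nll_loss(F.log_softmax(logits, dim=1), target)
print(torch.allclose(a, b))      # True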

Answer

I realised that the L2 penalty (weight_decay) in the Adam optimizer was keeping the loss value unchanged (I haven't tried other optimizers yet). It works when I remove it:

# optimizer = optim.Adam(net.parameters(), lr=0.01, weight_decay=0.1)
optimizer = optim.Adam(model.parameters(), lr=0.001)
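
A minimal sketch of why this helps (toy model and toy data, assuming a recent PyTorch): with lr=0.01 and weight_decay=0.1, the L2 penalty keeps pulling the weights back toward zero, so the loss hardly improves; dropping the decay lets the same model train normally.

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

def run(weight_decay):
    torch.manual_seed(0)
    model = nn.Linear(10, 2)                      # tiny stand-in classifier
    opt = optim.Adam(model.parameters(), lr=0.01, weight_decay=weight_decay)
    x = torch.randn(64, 10)
    y = (x[:, 0] > 0).long()                      # trivially learnable labels
    for _ in range(200):
        opt.zero_grad()
        loss = F.cross_entropy(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

print(run(0.1))   # final loss typically stays much higher: the decay keeps the weights small
print(run(0.0))   # final loss drops clearly on the same data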

=== UPDATE (see the answer above for more detail!) ===

self.features = nn.Sequential(self.flat_layer)
self.classifier = nn.Linear(out_channels * len(filter_sizes), num_classes)

...

optimizer = optim.Adam([
    {'params': model.features.parameters()},
    {'params': model.classifier.parameters(), 'weight_decay': 0.1}
], lr=0.001)
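
With the parameters split into two groups like this, the weight decay is applied only to the final classifier layer, while the convolutional feature extractor is trained without the L2 penalty, so the regularization no longer prevents the whole network from learning.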
