Measuring uncertainty using MC Dropout on PyTorch


Question

I am trying to implement a Bayesian CNN using MC Dropout on PyTorch. The main idea is that by applying dropout at test time and running many forward passes, you get predictions from a variety of different models. I found an application of MC Dropout, but I did not really understand how they applied the method and how exactly they chose the correct prediction from the list of predictions.

Here is the code:


def mcdropout_test(model):
    model.train()
    test_loss = 0
    correct = 0
    T = 100
    for data, target in test_loader:
        if args.cuda:
            data, target = data.cuda(), target.cuda()
        data, target = Variable(data, volatile=True), Variable(target)
        output_list = []
        for i in range(T):
            output_list.append(torch.unsqueeze(model(data), 0))
        output_mean = torch.cat(output_list, 0).mean(0)
        test_loss += F.nll_loss(F.log_softmax(output_mean, dim=1), target, reduction='sum').item()  # sum up batch loss
        pred = output_mean.data.max(1, keepdim=True)[1]  # get the index of the max log-probability
        correct += pred.eq(target.data.view_as(pred)).cpu().sum()

    test_loss /= len(test_loader.dataset)
    print('\nMC Dropout Test set: Average loss: {:.4f}, Accuracy: {}/{} ({:.2f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))


train()
mcdropout_test(model)

I have replaced

data, target = Variable(data, volatile=True), Variable(target)

by adding with torch.no_grad(): at the beginning, since volatile Variables are deprecated in recent versions of PyTorch.
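
A minimal sketch of what that replacement looks like (assuming the same args, model, T, and test_loader as in the code above):

with torch.no_grad():  # replaces Variable(..., volatile=True); no gradients are tracked
    for data, target in test_loader:
        if args.cuda:
            data, target = data.cuda(), target.cuda()
        # T stochastic forward passes with dropout still active
        output_list = [torch.unsqueeze(model(data), 0) for _ in range(T)]
        output_mean = torch.cat(output_list, 0).mean(0)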

This is how I have defined my CNN:

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 192, 5, padding=2)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(192, 192, 5, padding=2)
        self.fc1 = nn.Linear(192 * 8 * 8, 1024)
        self.fc2 = nn.Linear(1024, 256)
        self.fc3 = nn.Linear(256, 10)
        self.dropout = nn.Dropout(p=0.3)
        
        nn.init.xavier_uniform_(self.conv1.weight)
        nn.init.constant_(self.conv1.bias, 0.0)
        nn.init.xavier_uniform_(self.conv2.weight)
        nn.init.constant_(self.conv2.bias, 0.0)
        nn.init.xavier_uniform_(self.fc1.weight)
        nn.init.constant_(self.fc1.bias, 0.0)
        nn.init.xavier_uniform_(self.fc2.weight)
        nn.init.constant_(self.fc2.bias, 0.0)
        nn.init.xavier_uniform_(self.fc3.weight)
        nn.init.constant_(self.fc3.bias, 0.0)


    def forward(self, x):
        x = self.pool(F.relu(self.dropout(self.conv1(x))))  # recommended to add the relu
        x = self.pool(F.relu(self.dropout(self.conv2(x))))  # recommended to add the relu
        x = x.view(-1, 192 * 8 * 8)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(self.dropout(x)))
        x = self.fc3(self.dropout(x))  # no activation function needed for the last layer
        return x
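
As a quick sanity check of the tensor shapes (a hypothetical snippet; the 192 * 8 * 8 flatten implies 32x32 RGB inputs such as CIFAR-10, which the two 2x2 poolings reduce to 8x8):

net = Net()
x = torch.randn(4, 3, 32, 32)  # a batch of 4 CIFAR-10-sized images
print(net(x).shape)            # expected: torch.Size([4, 10])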

Can anyone help me get a correct implementation of the Monte Carlo Dropout method on a CNN?

Answer

Implementing MC Dropout in PyTorch is easy. All that needs to be done is to set the dropout layers of your model to train mode. This allows a different dropout mask to be used on each forward pass. Below is an implementation of MC Dropout in PyTorch illustrating how multiple predictions from the various forward passes are stacked together and used for computing different uncertainty metrics.

import sys

import numpy as np

import torch
import torch.nn as nn


def enable_dropout(model):
    """ Function to enable the dropout layers during test-time """
    for m in model.modules():
        if m.__class__.__name__.startswith('Dropout'):
            m.train()

def get_monte_carlo_predictions(data_loader,
                                forward_passes,
                                model,
                                n_classes,
                                n_samples):
    """ Function to get the monte-carlo samples and uncertainty estimates
    through multiple forward passes

    Parameters
    ----------
    data_loader : object
        data loader object from the data loader module
    forward_passes : int
        number of monte-carlo samples/forward passes
    model : object
        pytorch model
    n_classes : int
        number of classes in the dataset
    n_samples : int
        number of samples in the test set
    """

    dropout_predictions = np.empty((0, n_samples, n_classes))
    softmax = nn.Softmax(dim=1)
    for i in range(forward_passes):
        predictions = np.empty((0, n_classes))
        model.eval()
        enable_dropout(model)
        for image, label in data_loader:

            image = image.to(torch.device('cuda'))
            with torch.no_grad():
                output = model(image)
                output = softmax(output) # shape (n_samples, n_classes)
            predictions = np.vstack((predictions, output.cpu().numpy()))

        dropout_predictions = np.vstack((dropout_predictions,
                                         predictions[np.newaxis, :, :]))
        # dropout predictions - shape (forward_passes, n_samples, n_classes)
    
    # Calculating mean across multiple MCD forward passes 
    mean = np.mean(dropout_predictions, axis=0) # shape (n_samples, n_classes)

    # Calculating variance across multiple MCD forward passes 
    variance = np.var(dropout_predictions, axis=0) # shape (n_samples, n_classes)

    epsilon = sys.float_info.min
    # Calculating entropy across multiple MCD forward passes 
    entropy = -np.sum(mean*np.log(mean + epsilon), axis=-1) # shape (n_samples,)

    # Calculating mutual information across multiple MCD forward passes 
    mutual_info = entropy - np.mean(np.sum(-dropout_predictions*np.log(dropout_predictions + epsilon),
                                            axis=-1), axis=0) # shape (n_samples,)
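
The function above computes the metrics but does not return them; here is a hedged usage sketch, assuming a final return mean, variance, entropy, mutual_info is appended to it and that the Net model and test_loader from the question are reused:

model = Net().to(torch.device('cuda'))
# ... train the model as usual ...

# assumes get_monte_carlo_predictions was modified to return its four metrics
mean, variance, entropy, mutual_info = get_monte_carlo_predictions(
    data_loader=test_loader,
    forward_passes=100,  # T = 100 forward passes, as in the question
    model=model,
    n_classes=10,
    n_samples=len(test_loader.dataset))

# higher entropy / mutual information means the model is less certain about a sample
most_uncertain = np.argsort(entropy)[-10:]  # indices of the 10 most uncertain test samples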

Moving on to the implementation posted in the question above: multiple predictions from T different forward passes are obtained by first setting the entire model to train mode (model.train()). Note that this is not desirable, because unwanted stochasticity will be introduced in the predictions if the model contains layers other than dropout, such as batch norm, that also behave differently in train mode. Hence the best way is to set only the dropout layers to train mode, as shown in the snippet above.
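
A minimal sketch of the difference (hypothetical, for a model that also contains batch-norm layers):

# undesirable: puts every layer in train mode, so batch norm uses per-batch
# statistics and keeps updating its running averages during "testing"
model.train()

# preferred: eval mode for the whole model, then switch only the Dropout layers back on
model.eval()
enable_dropout(model)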

