How do I load in the MNIST digits and label data in MATLAB?

Question

I am trying to run the code given in the link

https://github.com/bd622/DiscretHashing

Discrete Hashing is a dimensionality reduction method used for approximate nearest neighbor search. I want to run that implementation on the MNIST database, which is available at http://yann.lecun.com/exdb/mnist/. I have already extracted the files from their compressed gz format.

Problem 1:

Using the solution from "Reading MNIST Image Database binary file in MATLAB" to read the MNIST database, I get the following error:

Error using fread
Invalid file identifier.  Use fopen to generate a valid file identifier.

Error in Reading (line 7)
A = fread(fid, 1, 'uint32');

Here is the code:

clear all;
close all;

%//Open file
fid = fopen('t10k-images-idx3-ubyte', 'r');

A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));

%//For each image, store into an individual cell
imageCellArray = cell(1, totalImages);
for k = 1 : totalImages
    %//Read in numRows*numCols pixels at a time
    A = fread(fid, numRows*numCols, 'uint8');
    %//Reshape so that it becomes a matrix
    %//We are actually reading this in column major format
    %//so we need to transpose this at the end
    imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end

%//Close the file
fclose(fid);

UPDATE: Problem 1 is solved, and the revised code is:

clear all;
close all;

%//Open file
fid = fopen('t10k-images.idx3-ubyte', 'r');

A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));

%//Read in total number of images
%//A = fread(fid, 4, 'uint8');
%//totalImages = sum(bitshift(A', [24 16 8 0]));

%//OR
A = fread(fid, 1, 'uint32');
totalImages = swapbytes(uint32(A));

%//Read in number of rows
%//A = fread(fid, 4, 'uint8');
%//numRows = sum(bitshift(A', [24 16 8 0]));

%//OR
A = fread(fid, 1, 'uint32');
numRows = swapbytes(uint32(A));

%//Read in number of columns
%//A = fread(fid, 4, 'uint8');
%//numCols = sum(bitshift(A', [24 16 8 0]));

%// OR
A = fread(fid, 1, 'uint32');
numCols = swapbytes(uint32(A));

for k = 1 : totalImages
    %//Read in numRows*numCols pixels at a time
    A = fread(fid, numRows*numCols, 'uint8');
    %//Reshape so that it becomes a matrix
    %//We are actually reading this in column major format
    %//so we need to transpose this at the end
    imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end

%//Close the file
fclose(fid);

Problem 2:

I cannot understand how to apply the 4 files of MNIST in the code. The code contains variables

traindata = double(traindata);
testdata = double(testdata);

How do I prepare the MNIST database so that I can use it with this implementation?

UPDATE: I implemented the solution, but I keep getting this error:

Error using fread
Invalid file identifier.  Use fopen to generate a valid file identifier.

Error in mnist_parse (line 11)
A = fread(fid1, 1, 'uint32');

These are the files:

demo.m % this is the main file that calls the function to read in the MNIST data

clear all
clc
[Trainimages, Trainlabels] = mnist_parse('C:\Users\Desktop\MNIST\train-images-idx3-ubyte', 'C:\Users\Desktop\MNIST\train-labels-idx1-ubyte');

[Testimages, Testlabels] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');

k=5;
digit = images(:,:,k);
lbl = label(k);

function [images, labels] = mnist_parse(path_to_digits, path_to_labels)

% Open files
fid1 = fopen(path_to_digits, 'r');

% The labels file
fid2 = fopen(path_to_labels, 'r');

% Read in magic numbers for both files
A = fread(fid1, 1, 'uint32');
magicNumber1 = swapbytes(uint32(A)); % Should be 2051
fprintf('Magic Number - Images: %d\n', magicNumber1);

A = fread(fid2, 1, 'uint32');
magicNumber2 = swapbytes(uint32(A)); % Should be 2049
fprintf('Magic Number - Labels: %d\n', magicNumber2);

% Read in total number of images
% Ensure that this number matches with the labels file
A = fread(fid1, 1, 'uint32');
totalImages = swapbytes(uint32(A));
A = fread(fid2, 1, 'uint32');
if totalImages ~= swapbytes(uint32(A))
    error('Total number of images read from images and labels files are not the same');
end
fprintf('Total number of images: %d\n', totalImages);

% Read in number of rows
A = fread(fid1, 1, 'uint32');
numRows = swapbytes(uint32(A));

% Read in number of columns
A = fread(fid1, 1, 'uint32');
numCols = swapbytes(uint32(A));

fprintf('Dimensions of each digit: %d x %d\n', numRows, numCols);

% For each image, store into an individual slice
images = zeros(numRows, numCols, totalImages, 'uint8');
for k = 1 : totalImages
    % Read in numRows*numCols pixels at a time
    A = fread(fid1, numRows*numCols, 'uint8');

    % Reshape so that it becomes a matrix
    % We are actually reading this in column major format
    % so we need to transpose this at the end
    images(:,:,k) = reshape(uint8(A), numCols, numRows).';
end

% Read in the labels
labels = fread(fid2, totalImages, 'uint8');

% Close the files
fclose(fid1);
fclose(fid2);

end


Recommended answer

I am the original author of Method #1 that you spoke of. The process to read in the training data and test labels is quite simple. In terms of reading in images, the code that you showed above reads the files perfectly and is in a cell array format. However, you are missing reading in the number of images, rows and columns inside the file. Take note that the MNIST format for this file is in the following fashion. The left column is the offset in bytes you are referencing with respect to the beginning:

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000803(2051) magic number
0004     32 bit integer  60000            number of images
0008     32 bit integer  28               number of rows
0012     32 bit integer  28               number of columns
0016     unsigned byte   ??               pixel
0017     unsigned byte   ??               pixel
........
xxxx     unsigned byte   ??               pixel

The first four bytes are a magic number: 2051 to ensure that you're reading in the file properly. The next four bytes denote the total number of images, then the next four bytes are the rows and finally the next four bytes are the columns. There should be 60000 images of size 28 rows by 28 columns. After this, the pixels are interleaved in row major format so you have to loop over series of 28 x 28 pixels and store them. In this case, I've stored them in a cell array and each element in this cell array would be one digit. The same format is for the test data as well, but there are 10000 images instead.
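
As a rough illustration of the layout above, the sketch below writes a tiny synthetic file in the same format (2 images of 2 x 3 pixels, made up for the example) and reads it back. It passes 'ieee-be' to fread so that the 32-bit fields are interpreted as big-endian directly, which is an alternative to the swapbytes approach used elsewhere in this answer and not part of the original code. The temporary file name and all variable names are assumptions.

```matlab
% Build a tiny synthetic IDX3 file so the sketch is self-contained.
fname = [tempname '.idx3-ubyte'];                    % hypothetical temporary file
fid = fopen(fname, 'w');
fwrite(fid, [2051 2 2 3], 'uint32', 0, 'ieee-be');   % magic, count, rows, cols
fwrite(fid, uint8(1:12), 'uint8');                   % pixels, row-major per image
fclose(fid);

% Read the four header fields in one call, stating that they are big-endian.
fid = fopen(fname, 'r');
header = fread(fid, 4, 'uint32', 0, 'ieee-be');
pixels = fread(fid, Inf, 'uint8=>uint8');            % remaining bytes are pixels
fclose(fid);
delete(fname);

magicNumber = header(1);
totalImages = header(2);
numRows     = header(3);
numCols     = header(4);

% Pixels are stored row by row, so reshape as cols x rows x N
% and swap the first two dimensions to get rows x cols x N.
images = permute(reshape(pixels, numCols, numRows, totalImages), [2 1 3]);
```

With the real MNIST training file, the same reads should yield a header of 2051, 60000, 28 and 28.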

As for the actual labels, it's roughly the same format but there are some slight differences:

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000801(2049) magic number (MSB first)
0004     32 bit integer  60000            number of items
0008     unsigned byte   ??               label
0009     unsigned byte   ??               label
........
xxxx     unsigned byte   ??               label

The first four bytes are a magic number: 2049, then the second set of four bytes tells you how many labels there are and finally there is exactly 1 byte for each corresponding digit in the dataset. The test data is also the same format but there are 10000 labels. As such, once you read in the necessary data in the label set, you just need one fread call and ensure that the data is unsigned 8-bit integer to read in the rest of the labels.
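
A minimal sketch of that label read, using a synthetic five-label file invented for the example so that it runs without the real data. It assumes a little-endian machine (as on typical x86 hardware), which is why swapbytes is applied to the two header fields; the file name and variable names are hypothetical.

```matlab
% Write a small synthetic IDX1 label file (5 made-up labels).
fname = [tempname '.idx1-ubyte'];                  % hypothetical temporary file
fid = fopen(fname, 'w');
fwrite(fid, [2049 5], 'uint32', 0, 'ieee-be');     % magic number, number of items
fwrite(fid, uint8([7 2 1 0 4]), 'uint8');          % one byte per label
fclose(fid);

fid = fopen(fname, 'r');
% Header: two 32-bit integers, stored MSB first, so swap after a native read.
magicNumber = swapbytes(fread(fid, 1, 'uint32=>uint32'));
numLabels   = swapbytes(fread(fid, 1, 'uint32=>uint32'));

% Everything after the header is one unsigned byte per label: one fread call.
labels = fread(fid, double(numLabels), 'uint8');
fclose(fid);
delete(fname);
```

Single-byte values have no byte order, so the labels themselves need no swapping.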

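The reason you have to use swapbytes is that the MNIST files store their 32-bit integers with the most significant byte first (big-endian), whereas fread by default uses the machine's native byte order, which on most systems is little-endian, meaning that the first byte read is treated as the least significant one. swapbytes reverses the byte order after the read so that the value comes out correctly.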
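
To make the byte-order issue concrete, here is a small worked example using the image magic number 0x00000803. It only does arithmetic on the four bytes, so it does not depend on any file; the variable names are illustrative.

```matlab
% The four bytes of the image magic number as they appear in the file (MSB first).
bytesInFile = [0 0 8 3];

% A little-endian read treats the first byte as the least significant one.
asLittleEndian = uint32(sum(bytesInFile .* 256.^(0:3)));

% swapbytes reverses the byte order and recovers the intended value.
corrected = swapbytes(asLittleEndian);

% Equivalent manual reconstruction: weight the first byte as most significant.
manual = sum(bytesInFile .* 256.^(3:-1:0));

fprintf('Little-endian read: %d, corrected: %d, manual: %d\n', ...
        asLittleEndian, corrected, manual);
```

The uncorrected value is 50855936, while both corrected forms give 2051.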

As such, I have modified this code for you so that it's an actual function that takes in two strings: the full path to the image file of digits and the full path to the labels file. I have also changed the code so that the images are stored in a 3D numeric matrix as opposed to a cell array, for faster processing. Take note that when you start reading in the actual image data, each pixel is an unsigned 8-bit integer, so there's no need to swap any bytes. Swapping is only required when a single value spans multiple bytes, as with the 32-bit header fields:

function [images, labels] = mnist_parse(path_to_digits, path_to_labels)

% Open files
fid1 = fopen(path_to_digits, 'r');

% The labels file
fid2 = fopen(path_to_labels, 'r');

% Read in magic numbers for both files
A = fread(fid1, 1, 'uint32');
magicNumber1 = swapbytes(uint32(A)); % Should be 2051
fprintf('Magic Number - Images: %d\n', magicNumber1);

A = fread(fid2, 1, 'uint32');
magicNumber2 = swapbytes(uint32(A)); % Should be 2049
fprintf('Magic Number - Labels: %d\n', magicNumber2);

% Read in total number of images
% Ensure that this number matches with the labels file
A = fread(fid1, 1, 'uint32');
totalImages = swapbytes(uint32(A));
A = fread(fid2, 1, 'uint32');
if totalImages ~= swapbytes(uint32(A))
    error('Total number of images read from images and labels files are not the same');
end
fprintf('Total number of images: %d\n', totalImages);

% Read in number of rows
A = fread(fid1, 1, 'uint32');
numRows = swapbytes(uint32(A));

% Read in number of columns
A = fread(fid1, 1, 'uint32');
numCols = swapbytes(uint32(A));

fprintf('Dimensions of each digit: %d x %d\n', numRows, numCols);

% For each image, store into an individual slice
images = zeros(numRows, numCols, totalImages, 'uint8');
for k = 1 : totalImages
    % Read in numRows*numCols pixels at a time
    A = fread(fid1, numRows*numCols, 'uint8');

    % Reshape so that it becomes a matrix
    % We are actually reading this in column major format
    % so we need to transpose this at the end
    images(:,:,k) = reshape(uint8(A), numCols, numRows).';
end

% Read in the labels
labels = fread(fid2, totalImages, 'uint8');

% Close the files
fclose(fid1);
fclose(fid2);

end

To call this function, simply specify the path to both the image file and the labels file. Assuming you are running this file in the same directory where the files are located, you would do one of the following for the training images:

[images, labels] = mnist_parse('train-images-idx3-ubyte', 'train-labels-idx1-ubyte');

Also, you would do the following for the test images:

[images, labels] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');
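
If a path is wrong, fopen returns -1 and the first fread then fails with "Invalid file identifier", which is the error shown in the question. The following is only a sketch of a check you could run before calling mnist_parse; the path is a placeholder, and the extracted files may be named slightly differently on your machine (the question's fix, for instance, used 't10k-images.idx3-ubyte' with a dot).

```matlab
% Hypothetical path; replace with the location of your extracted file.
path_to_digits = 'train-images-idx3-ubyte';

% fopen returns -1 (plus an explanatory message) when the file cannot be opened.
[fid, msg] = fopen(path_to_digits, 'r');
opened = (fid ~= -1);

if opened
    fclose(fid);
    fprintf('Opened %s successfully.\n', path_to_digits);
else
    fprintf('Could not open %s: %s\n', path_to_digits, msg);
    % List similarly named files in the current folder to spot naming differences.
    candidates = dir('*ubyte*');
    for i = 1:numel(candidates)
        fprintf('  found: %s\n', candidates(i).name);
    end
end
```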

To access the kth digit, you would simply do:

digit = images(:,:,k);

The corresponding label for the kth digit would be:

lbl = labels(k);

Finally, to get this data into a format that the code on GitHub accepts: that code assumes that rows correspond to training examples and columns correspond to features. To get this format, simply reshape the data so that each image's pixels are spread out over the columns of a single row.

Therefore, just do this:

[traindata, traingnd] = mnist_parse('train-images-idx3-ubyte', 'train-labels-idx1-ubyte');
traindata = double(reshape(traindata, size(traindata,1)*size(traindata,2), []).');
traingnd = double(traingnd);

[testdata, testgnd] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');
testdata = double(reshape(testdata, size(testdata,1)*size(testdata,2), []).');
testgnd = double(testgnd);

The above uses the same variables as in the script, so you should be able to plug this in and it should work. The second line reshapes the matrix so that each digit is in a column, and then transposes it so that each digit is in a row. We also need to cast to double, as that is what the GitHub code is doing. The same logic is applied to the test data. Also take note that I've explicitly cast the training and test labels to double to ensure maximum compatibility with whatever algorithms you decide to use on this data.
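
To see what the reshape and transpose do, here is a small sketch on made-up data: four fake "images" of 2 x 3 pixels instead of the real 28 x 28 digits. The variable names are illustrative only.

```matlab
% Four fake images of size 2 x 3, stacked along the third dimension.
imgs = uint8(reshape(1:24, 2, 3, 4));

% Flatten each image into a column (6 x 4), then transpose so that
% each row is one example and each column is one pixel feature (4 x 6).
data = double(reshape(imgs, size(imgs,1)*size(imgs,2), []).');

% Any row can be turned back into its original image with another reshape.
firstImage = reshape(data(1,:), size(imgs,1), size(imgs,2));

disp(size(data));
```

Applied to the MNIST training set, the same steps should give a 60000 x 784 matrix.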

Happy digit hacking!
