How to read the MNIST database in R?
Problem description
I'm currently working on a case study for which I need to work on the MNIST database.
The files on this site are said to be in IDX file format. I tried to take a look at these files using basic text editors like Notepad and WordPad, but no luck there.
Expecting that they would be in the high endian format, I tried the following:
to.read = file("t10k-images.idx3-ubyte", "rb")
readBin(to.read, integer(), n=100, endian = "high")
I got some numbers as output, but none of them made any sense to me.
Can anyone please explain how to read the MNIST database files in R and how to interpret those numbers? Thanks.
Answer
Use endian="big", not "high":
> to.read = file("~/Downloads/t10k-images-idx3-ubyte", "rb")
magic number:
> readBin(to.read, integer(), n=1, endian="big")
[1] 2051
number of images:
> readBin(to.read, integer(), n=1, endian="big")
[1] 10000
number of rows:
> readBin(to.read, integer(), n=1, endian="big")
[1] 28
number of columns:
> readBin(to.read, integer(), n=1, endian="big")
[1] 28
here comes the data:
> readBin(to.read, integer(), n=1, endian="big")
[1] 0
> readBin(to.read, integer(), n=1, endian="big")
[1] 0
as per the training set image data description on the web site.
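The header reads above can be wrapped into one small function. This is only a sketch: the name read_idx3_header is made up for illustration, and the demo writes a tiny synthetic file (2 images of 3x2 pixels) so that it runs without the MNIST download.

```r
# Hypothetical helper: read the 16-byte header of an IDX3 image file.
# The header is four 4-byte big-endian integers:
# magic number, number of images, number of rows, number of columns.
read_idx3_header <- function(con) {
  h <- readBin(con, integer(), n = 4, size = 4, endian = "big")
  stopifnot(length(h) == 4, h[1] == 2051)  # 2051 is the magic number seen above
  list(n_images = h[2], n_rows = h[3], n_cols = h[4])
}

# Demo on a synthetic file, not on real MNIST data
tmp <- tempfile()
out <- file(tmp, "wb")
writeBin(c(2051L, 2L, 3L, 2L), out, size = 4, endian = "big")  # header
writeBin(as.raw(0:11), out)                                    # 2 * 3 * 2 pixel bytes
close(out)

con <- file(tmp, "rb")
hdr <- read_idx3_header(con)
close(con)
str(hdr)
```

On the real t10k file the same call should return 10000, 28 and 28, matching the console output above.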
Now you just need to loop and read 28*28 byte chunks into matrices.
Start again:
> to.read = file("~/Downloads/t10k-images-idx3-ubyte", "rb")
skip header:
> readBin(to.read, integer(), n=4, endian="big")
[1] 2051 10000 28 28
should really get the 28,28 from the header read but hard-coded here (signed=FALSE keeps the pixel bytes in 0..255 instead of -128..127):
> m = matrix(readBin(to.read, integer(), size=1, n=28*28, signed=FALSE, endian="big"), 28, 28)
> image(m)
Might need to transpose or flip the matrix, I think it's an upside-down "7".
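Why the flip is needed: the file stores pixels row by row, matrix() fills column by column, and image() draws the first index along x with the second index increasing upwards. A small sketch on a made-up 2x3 "image" (the helper name to_image_layout is hypothetical) compares the textbook rearrangement with the shortcut of filling column-wise and reversing the columns:

```r
# A made-up 2-row, 3-column "image", stored row by row as in the IDX file
pixels <- 1:6
n_rows <- 2
n_cols <- 3

# The natural picture, indexed as img[row, col]
img <- matrix(pixels, nrow = n_rows, ncol = n_cols, byrow = TRUE)

# Hypothetical helper: rearrange img so that image() would draw it upright
# (transpose, then reverse the columns so that row 1 ends up at the top)
to_image_layout <- function(img) t(img)[, nrow(img):1, drop = FALSE]

# The shortcut: fill column-wise with the dimensions swapped, then reverse the columns
m <- matrix(pixels, n_cols, n_rows)
shortcut <- m[, n_rows:1]

identical(to_image_layout(img), shortcut)
```

For the square 28x28 digits the swapped dimensions are invisible, which is why matrix(..., 28, 28) followed by m[,28:1] is enough.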
par(mfrow=c(5,5))
par(mar=c(0,0,0,0))  # mar needs four values: bottom, left, top, right
for(i in 1:25){
  m = matrix(readBin(to.read, integer(), size=1, n=28*28, signed=FALSE, endian="big"), 28, 28)
  image(m[,28:1])
}
gets you a 5-by-5 grid showing the next 25 digits.
Oh, and Google leads me to: http://www.inside-r.org/packages/cran/darch/docs/readMNIST which might be useful.
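Putting the pieces together, here is a rough sketch of reading a whole image file in one go, taking the dimensions from the header instead of hard-coding 28. The function name read_idx3_images is made up for illustration, and the demo uses a tiny synthetic file rather than the real download.

```r
# Hypothetical helper: read every image of an IDX3 file into a 3-d array
# indexed as [row, column, image].
read_idx3_images <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  h <- readBin(con, integer(), n = 4, size = 4, endian = "big")
  stopifnot(length(h) == 4, h[1] == 2051)  # magic number for image files
  n <- h[2]; nr <- h[3]; nc <- h[4]
  # One unsigned byte per pixel; signed = FALSE keeps values in 0..255
  px <- readBin(con, integer(), n = n * nr * nc, size = 1, signed = FALSE)
  stopifnot(length(px) == n * nr * nc)
  # Pixels are stored row by row, so fill as [col, row, image]
  # and then swap the first two dimensions
  aperm(array(px, dim = c(nc, nr, n)), c(2, 1, 3))
}

# Demo on a synthetic file: 2 images of 2 rows x 3 columns
tmp <- tempfile()
out <- file(tmp, "wb")
writeBin(c(2051L, 2L, 2L, 3L), out, size = 4, endian = "big")
writeBin(as.raw(c(0:5, 250:255)), out)
close(out)

imgs <- read_idx3_images(tmp)
dim(imgs)
imgs[, , 2]
```

With the real file, something like image(t(imgs[, , i])[, nrow(imgs):1]) should draw digit i the right way up.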