Fashion-MNIST

Description from the official website

Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. We intend Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.

Overview

The MLDatasets.FashionMNIST sub-module provides a programmatic interface to download, load, and work with the Fashion-MNIST dataset.

using MLDatasets

# load full training set
train_x, train_y = FashionMNIST.traindata()

# load full test set
test_x,  test_y  = FashionMNIST.testdata()

These functions also accept optional arguments, such as the directory dir where the dataset is located, or the specific observation indices to load. For more information on the interface, see the docstrings (e.g. ?FashionMNIST.traindata).

Function                              Description
download([dir])                       Trigger (interactive) download of the dataset
classnames()                          Return the class names as a vector of strings
traintensor([T], [indices]; [dir])    Load the training images as an array of eltype T
trainlabels([indices]; [dir])         Load the labels for the training images
testtensor([T], [indices]; [dir])     Load the test images as an array of eltype T
testlabels([indices]; [dir])          Load the labels for the test images
traindata([T], [indices]; [dir])      Load images and labels of the training data
testdata([T], [indices]; [dir])       Load images and labels of the test data

This module also provides utility functions to make working with the Fashion-MNIST dataset in Julia more convenient.

Function                Description
convert2image(array)    Convert the Fashion-MNIST tensor/matrix to a colorant array

To visualize an image or a prediction, the function convert2image converts the given Fashion-MNIST horizontal-major tensor (or feature matrix) to a vertical-major Colorant array. The values are also color corrected according to the website's description, so the articles appear dark on a white background.

julia> FashionMNIST.convert2image(FashionMNIST.traintensor(1)) # first training image
28×28 Array{Gray{N0f8},2}:
[...]
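
The two steps described above (swapping the axis order and inverting the intensities) can be illustrated on a plain numeric matrix. This is only a sketch of the idea, not the library's implementation: sketch_convert is a hypothetical helper, and it assumes pixel values in [0, 1] where 0 is background.

```julia
# Hypothetical helper, not part of MLDatasets: mimics the axis swap and the
# intensity inversion on a plain numeric matrix with values in [0, 1].
function sketch_convert(x::AbstractMatrix{<:Real})
    flipped = permutedims(x, (2, 1))  # horizontal-major -> vertical-major
    return 1 .- flipped               # 0 becomes 1 (white), 1 becomes 0 (black)
end

# A tiny 2x3 stand-in "image" stored width-first
x = [0.0 0.25 1.0;
     0.5 0.75 0.0]

img = sketch_convert(x)
println(size(img))  # the axes are swapped relative to x
```

The real convert2image additionally wraps the values in a Gray colorant type, which this sketch leaves out to stay free of package dependencies.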

API Documentation

MLDatasets.FashionMNIST - Module

Fashion-MNIST

  • Authors: Han Xiao, Kashif Rasul, Roland Vollgraf
  • Website: https://github.com/zalandoresearch/fashion-mnist

Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. It can serve as a drop-in replacement for MNIST.

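Interface

  • FashionMNIST.traintensor, FashionMNIST.trainlabels, FashionMNIST.traindata
  • FashionMNIST.testtensor, FashionMNIST.testlabels, FashionMNIST.testdata

Utilities

  • FashionMNIST.download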

The FashionMNIST module also re-exports convert2image from the MNIST module.

Training set

MLDatasets.FashionMNIST.traintensor - Function
traintensor([T = N0f8], [indices]; [dir]) -> Array{T}

Same as MNIST.traintensor but for the FashionMNIST dataset.
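
As a rough illustration of what the eltype argument T and the returned array shape mean in practice, the sketch below uses random bytes as a stand-in for real images. It assumes that images come back as a 28×28×N array, as the MNIST counterpart documents, and it relies on the fact that N0f8 interprets each byte b as the fraction b/255. The variable names are illustrative only.

```julia
# Stand-in for a few raw 8-bit images (one byte per pixel); no download needed.
raw = rand(UInt8, 28, 28, 5)

# N0f8 reads each byte b as the fraction b/255. The same values as Float32:
x = Float32.(raw) ./ 255f0

# Flatten each 28x28 image into one column to obtain a 784 x N feature matrix.
features = reshape(x, 28 * 28, :)

println(size(features))
```

With the real data, requesting a floating point eltype directly (for example passing Float32 as T) avoids the manual conversion step.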

The resource files of the dataset are expected to be located in the directory dir. If dir is omitted, the directories in DataDeps.default_loadpath are searched for an existing FashionMNIST subfolder. If no such subfolder is found, dir defaults to ~/.julia/datadeps/FashionMNIST. If dir does not yet exist, a download prompt is triggered. You can also call FashionMNIST.download([dir]) explicitly to pre-download (or re-download) the dataset. See the documentation of the package DataDeps.jl for more detail and configuration options.

MLDatasets.FashionMNIST.trainlabels - Function
trainlabels([indices]; [dir])

Returns the Fashion-MNIST trainset labels corresponding to the given indices as an Int or Vector{Int}. The values of the labels denote the zero-based class-index that they represent (see FashionMNIST.classnames for the corresponding names). If indices is omitted, all labels are returned.

julia> FashionMNIST.trainlabels() # full training set
60000-element Array{Int64,1}:
 9
 0
 ⋮
 0
 5

julia> FashionMNIST.trainlabels(1:3) # first three labels
3-element Array{Int64,1}:
 9
 0
 0

julia> y = FashionMNIST.trainlabels(1) # first label
9

julia> FashionMNIST.classnames()[y + 1] # corresponding name
"Ankle boot"

The dir keyword is resolved in the same way as for FashionMNIST.traintensor.
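
Because the labels are zero-based while Julia arrays are one-based, every lookup needs a +1 offset. The sketch below shows this without loading the dataset. The class names are copied from the Fashion-MNIST README and are an assumption here: the strings returned by FashionMNIST.classnames() may be spelled slightly differently. CLASS_NAMES and label_to_name are hypothetical names.

```julia
# Class names as listed in the Fashion-MNIST README (assumption: the vector
# returned by FashionMNIST.classnames() may differ in spelling).
const CLASS_NAMES = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
                     "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Labels are zero-based, Julia arrays are one-based, hence the +1.
label_to_name(y::Integer) = CLASS_NAMES[y + 1]

labels = [9, 0, 0]  # the first three training labels from the example above
label_names = label_to_name.(labels)

# Count how often each class occurs among the given labels.
counts = zeros(Int, length(CLASS_NAMES))
for y in labels
    counts[y + 1] += 1
end

println(label_names)
```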

MLDatasets.FashionMNIST.traindata - Function
traindata([T = N0f8], [indices]; [dir]) -> images, labels

Same as MNIST.traindata but for the FashionMNIST dataset.

Take a look at FashionMNIST.traintensor and FashionMNIST.trainlabels for more information.

Test set

MLDatasets.FashionMNIST.testtensor - Function
testtensor([T = N0f8], [indices]; [dir]) -> Array{T}

Same as MNIST.testtensor but for the FashionMNIST dataset.

The dir keyword is resolved in the same way as for FashionMNIST.traintensor.

MLDatasets.FashionMNIST.testlabels - Function
testlabels([indices]; [dir])

Returns the Fashion-MNIST testset labels corresponding to the given indices as an Int or Vector{Int}. The values of the labels denote the zero-based class-index that they represent (see FashionMNIST.classnames for the corresponding names). If indices is omitted, all labels are returned.

julia> FashionMNIST.testlabels() # full test set
10000-element Array{Int64,1}:
 9
 2
 ⋮
 1
 5

julia> FashionMNIST.testlabels(1:3) # first three labels
3-element Array{Int64,1}:
 9
 2
 1

julia> y = FashionMNIST.testlabels(1) # first label
9

julia> FashionMNIST.classnames()[y + 1] # corresponding name
"Ankle boot"

The dir keyword is resolved in the same way as for FashionMNIST.traintensor.

MLDatasets.FashionMNIST.testdata - Function
testdata([T = N0f8], [indices]; [dir]) -> images, labels

Same as MNIST.testdata but for the FashionMNIST dataset.

Take a look at FashionMNIST.testtensor and FashionMNIST.testlabels for more information.

Utilities

MLDatasets.FashionMNIST.download - Function
download([dir]; [i_accept_the_terms_of_use])

Trigger the (interactive) download of the full dataset into "dir". If no dir is provided the dataset will be downloaded into "~/.julia/datadeps/FashionMNIST".

This function will display an interactive dialog unless either the keyword parameter i_accept_the_terms_of_use or the environment variable DATADEPS_ALWAYS_ACCEPT is set to true. Note that using the data responsibly and respecting copyright/terms-of-use remains your responsibility.

References

  • Authors: Han Xiao, Kashif Rasul, Roland Vollgraf

  • Website: https://github.com/zalandoresearch/fashion-mnist

  • [Han Xiao et al. 2017] Han Xiao, Kashif Rasul, and Roland Vollgraf. "Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms." arXiv:1708.07747