Vision Datasets

A collection of datasets for 2D computer vision.

Numerical arrays can be converted to color images with convert2image and displayed in the terminal using the ImageInTerminal.jl package.

Documentation

MLDatasets.convert2image (Function)
convert2image(d, i)
convert2image(d, x)
convert2image(DType, x)

Convert the observation(s) i from dataset d to image(s). A numerical array x can also be converted directly.

In order to support a new dataset, e.g. MyDataset, implement convert2image(::Type{MyDataset}, x::AbstractArray).

Examples

julia> using MLDatasets, ImageInTerminal

julia> d = MNIST()

julia> convert2image(d, 1:2) 
# You should see 2 images in the terminal

julia> x = d[1].features;

julia> convert2image(MNIST, x) # or convert2image(d, x)
MLDatasets.CIFAR10 (Type)
CIFAR10(; Tx=Float32, split=:train, dir=nothing)
CIFAR10([Tx, split])

The CIFAR10 dataset is a labeled subset of the 80 million tiny images dataset. It consists of 60000 32x32 colour images in 10 classes, with 6000 images per class.

Arguments

  • Tx: the element type used for the features array (Float32 by default).

  • dir: the directory from which to load, or into which to download, the dataset. Defaults to a standard location.

  • split: selects the data partition. Can take the values :train or :test.

Fields

  • metadata: A dictionary containing additional information on the dataset.

  • features: An array storing the data features.

  • targets: An array storing the targets for supervised learning.

  • split: the selected data partition.

Methods

  • dataset[i]: Return observation(s) i as a named tuple of features and targets.

  • dataset[:]: Return all observations as a named tuple of features and targets.

  • length(dataset): Number of observations.

  • convert2image converts features to RGB images.

Examples

julia> using MLDatasets: CIFAR10

julia> dataset = CIFAR10()
CIFAR10:
  metadata    =>    Dict{String, Any} with 2 entries
  split       =>    :train
  features    =>    32×32×3×50000 Array{Float32, 4}
  targets     =>    50000-element Vector{Int64}

julia> dataset[1:5].targets
5-element Vector{Int64}:
 6
 9
 9
 4
 1

julia> X, y = dataset[:];

julia> dataset = CIFAR10(Tx=Float64, split=:test)
CIFAR10:
  metadata    =>    Dict{String, Any} with 2 entries
  split       =>    :test
  features    =>    32×32×3×10000 Array{Float64, 4}
  targets     =>    10000-element Vector{Int64}

julia> dataset.metadata
Dict{String, Any} with 2 entries:
  "n_observations" => 10000
  "class_names"    => ["airplane", "automobile", "bird", "cat", "deer", "dog", "frog", "horse", "ship", "truck"]
MLDatasets.CIFAR100 (Type)
CIFAR100(; Tx=Float32, split=:train, dir=nothing)
CIFAR100([Tx, split])

The CIFAR100 dataset is a labeled subset of the 80 million tiny images dataset. It consists of 60000 32x32 colour images in 100 classes and 20 superclasses, with 600 images per class.

The targets are returned as a named tuple of coarse and fine labels: the coarse field holds the superclass label(s) and the fine field the class label(s), each an Int or a Vector{Int} depending on the requested indices.

Arguments

  • Tx: the element type used for the features array (Float32 by default).

  • dir: the directory from which to load, or into which to download, the dataset. Defaults to a standard location.

  • split: selects the data partition. Can take the values :train or :test.

Fields

  • metadata: A dictionary containing additional information on the dataset.

  • features: An array storing the data features.

  • targets: An array storing the targets for supervised learning.

  • split: the selected data partition.

Methods

  • dataset[i]: Return observation(s) i as a named tuple of features and targets.

  • dataset[:]: Return all observations as a named tuple of features and targets.

  • length(dataset): Number of observations.

  • convert2image converts features to RGB images.

Examples

julia> dataset = CIFAR100()
CIFAR100:
  metadata    =>    Dict{String, Any} with 3 entries
  split       =>    :train
  features    =>    32×32×3×50000 Array{Float32, 4}
  targets     =>    (coarse = "50000-element Vector{Int64}", fine = "50000-element Vector{Int64}")

julia> dataset[1:5].targets
(coarse = [11, 15, 4, 14, 1], fine = [19, 29, 0, 11, 1])

julia> X, y = dataset[:];

julia> dataset.metadata
Dict{String, Any} with 3 entries:
  "n_observations"     => 50000
  "class_names_coarse" => ["aquatic_mammals", "fish", "flowers", "food_containers", "fruit_and_vegetables", "household_electrical_devices", "household_furniture", "insects", "large_carnivores", "large_man-made_…
  "class_names_fine"   => ["apple", "aquarium_fish", "baby", "bear", "beaver", "bed", "bee", "beetle", "bicycle", "bottle"  …  "train", "trout", "tulip", "turtle", "wardrobe", "whale", "willow_tree", "wolf", "w…
MLDatasets.EMNIST (Type)
EMNIST(name; Tx=Float32, split=:train, dir=nothing)
EMNIST(name, [Tx, split])

The EMNIST dataset is a set of handwritten character digits derived from the NIST Special Database 19 (https://www.nist.gov/srd/nist-special-database-19) and converted to a 28x28 pixel image format and dataset structure that directly matches the MNIST dataset (http://yann.lecun.com/exdb/mnist/). Further information on the dataset contents and conversion process can be found in the paper available at https://arxiv.org/abs/1702.05373v1.

Arguments

  • name: name of the EMNIST dataset. Possible values are: :balanced, :byclass, :bymerge, :digits, :letters, :mnist.
  • Tx: the element type used for the features array (Float32 by default).
  • split: selects the data partition. Can take the values :train or :test.
  • dir: the directory from which to load, or into which to download, the dataset. Defaults to a standard location.

Fields

  • name: the selected EMNIST variant.
  • split: the selected data partition.
  • metadata: A dictionary containing additional information on the dataset.
  • features: An array storing the data features.
  • targets: An array storing the targets for supervised learning.

Methods

  • dataset[i]: Return observation(s) i as a named tuple of features and targets.

  • dataset[:]: Return all observations as a named tuple of features and targets.

  • length(dataset): Number of observations.

  • convert2image converts features to Gray images.

Examples

The images are loaded as a multi-dimensional array of eltype Tx. If Tx <: Integer, then all values will be between 0 and 255, otherwise the values are scaled to be between 0 and 1. EMNIST().features is a 3D array (i.e. an Array{Tx,3}) in WHN format (width, height, num_images). Labels are stored as a vector of integers in EMNIST().targets.

julia> using MLDatasets: EMNIST

julia> dataset = EMNIST(:letters, split=:train)
EMNIST:
  metadata    =>    Dict{String, Any} with 3 entries
  split       =>    :train
  features    =>    28×28×60000 Array{Float32, 3}
  targets     =>    60000-element Vector{Int64}

julia> dataset[1:5].targets
5-element Vector{Int64}:
 7
 2
 1
 0
 4

julia> X, y = dataset[:];

julia> dataset = EMNIST(:balanced, Tx=UInt8, split=:test)
EMNIST:
  metadata    =>    Dict{String, Any} with 3 entries
  split       =>    :test
  features    =>    28×28×10000 Array{UInt8, 3}
  targets     =>    10000-element Vector{Int64}
MLDatasets.FashionMNIST (Type)
FashionMNIST(; Tx=Float32, split=:train, dir=nothing)
FashionMNIST([Tx, split])

FashionMNIST is a dataset of Zalando's article images consisting of a training set of 60000 examples and a test set of 10000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. It can serve as a drop-in replacement for MNIST.

  • Authors: Han Xiao, Kashif Rasul, Roland Vollgraf
  • Website: https://github.com/zalandoresearch/fashion-mnist

See MNIST for details of the interface.

MLDatasets.MNIST (Type)
MNIST(; Tx=Float32, split=:train, dir=nothing)
MNIST([Tx, split])

The MNIST database of handwritten digits.

  • Authors: Yann LeCun, Corinna Cortes, Christopher J.C. Burges
  • Website: http://yann.lecun.com/exdb/mnist/

MNIST is a classic image-classification dataset that is often used in small-scale machine learning experiments. It contains 70,000 images of handwritten digits. Each observation is a 28x28 pixel gray-scale image that depicts a handwritten version of 1 of the 10 possible digits (0-9).

Arguments

  • Tx: the element type used for the features array (Float32 by default).

  • dir: the directory from which to load, or into which to download, the dataset. Defaults to a standard location.

  • split: selects the data partition. Can take the values :train or :test.

Fields

  • metadata: A dictionary containing additional information on the dataset.

  • features: An array storing the data features.

  • targets: An array storing the targets for supervised learning.

  • split: the selected data partition.

Methods

  • dataset[i]: Return observation(s) i as a named tuple of features and targets.

  • dataset[:]: Return all observations as a named tuple of features and targets.

  • length(dataset): Number of observations.

  • convert2image converts features to Gray images.

Examples

The images are loaded as a multi-dimensional array of eltype Tx. If Tx <: Integer, then all values will be between 0 and 255, otherwise the values are scaled to be between 0 and 1. MNIST().features is a 3D array (i.e. an Array{Tx,3}) in WHN format (width, height, num_images). Labels are stored as a vector of integers in MNIST().targets.

julia> using MLDatasets: MNIST

julia> dataset = MNIST(:train)
MNIST:
  metadata    =>    Dict{String, Any} with 3 entries
  split       =>    :train
  features    =>    28×28×60000 Array{Float32, 3}
  targets     =>    60000-element Vector{Int64}

julia> dataset[1:5].targets
5-element Vector{Int64}:
 7
 2
 1
 0
 4

julia> X, y = dataset[:];

julia> dataset = MNIST(UInt8, :test)
MNIST:
  metadata    =>    Dict{String, Any} with 3 entries
  split       =>    :test
  features    =>    28×28×10000 Array{UInt8, 3}
  targets     =>    10000-element Vector{Int64}
MLDatasets.Omniglot (Type)
Omniglot(; Tx=Float32, split=:train, dir=nothing)
Omniglot([Tx, split])

Omniglot data set for one-shot learning

  • Authors: Brenden M. Lake, Ruslan Salakhutdinov, Joshua B. Tenenbaum
  • Website: https://github.com/brendenlake/omniglot

The Omniglot data set is designed for developing more human-like learning algorithms. It contains 1623 different handwritten characters from 50 different alphabets. Each of the 1623 characters was drawn online via Amazon's Mechanical Turk by 20 different people. Each image is paired with stroke data: sequences of [x,y,t] coordinates with time (t) in milliseconds.

Arguments

  • Tx: the element type used for the features array (Float32 by default).

  • dir: the directory from which to load, or into which to download, the dataset. Defaults to a standard location.

  • split: selects the data partition. Can take the values :train, :test, :small1, or :small2.

Fields

  • metadata: A dictionary containing additional information on the dataset.

  • features: An array storing the data features.

  • targets: An array storing the targets for supervised learning.

  • split: the selected data partition.

Methods

  • dataset[i]: Return observation(s) i as a named tuple of features and targets.

  • dataset[:]: Return all observations as a named tuple of features and targets.

  • length(dataset): Number of observations.

  • convert2image converts features to Gray images.

Examples

The images are loaded as a multi-dimensional array of eltype Tx. All values will be 0 or 1. Omniglot().features is a 3D array (i.e. an Array{Tx,3}) in WHN format (width, height, num_images). Labels are stored as a vector of strings in Omniglot().targets.

julia> using MLDatasets: Omniglot

julia> dataset = Omniglot(:train)
Omniglot:
  metadata    =>    Dict{String, Any} with 3 entries
  split       =>    :train
  features    =>    105×105×19280 Array{Float32, 3}
  targets     =>    19280-element Vector{String}

julia> dataset[1:5].targets
5-element Vector{String}:
 "Arcadian"
 "Arcadian"
 "Arcadian"
 "Arcadian"
 "Arcadian"

julia> X, y = dataset[:];

julia> dataset = Omniglot(UInt8, :test)
Omniglot:
  metadata    =>    Dict{String, Any} with 3 entries
  split       =>    :test
  features    =>    105×105×13180 Array{UInt8, 3}
  targets     =>    13180-element Vector{String}
MLDatasets.SVHN2 (Type)
SVHN2(; Tx=Float32, split=:train, dir=nothing)
SVHN2([Tx, split])

The Street View House Numbers (SVHN) Dataset.

  • Authors: Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, Andrew Y. Ng
  • Website: http://ufldl.stanford.edu/housenumbers

SVHN was obtained from house numbers in Google Street View images, so the images are quite diverse in orientation and background. Similar to MNIST, SVHN has 10 classes (the digits 0-9), but unlike MNIST there is more data and the images are a little bigger (32x32 instead of 28x28), with an additional RGB color channel. The dataset is split into three subsets: 73257 digits for training, 26032 digits for testing, and 531131 additional images to use as extra training data.

Arguments

  • Tx: the element type used for the features array (Float32 by default).

  • dir: the directory from which to load, or into which to download, the dataset. Defaults to a standard location.

  • split: selects the data partition. Can take the values :train, :test or :extra.

Fields

  • metadata: A dictionary containing additional information on the dataset.

  • features: An array storing the data features.

  • targets: An array storing the targets for supervised learning.

  • split: the selected data partition.

Methods

  • dataset[i]: Return observation(s) i as a named tuple of features and targets.

  • dataset[:]: Return all observations as a named tuple of features and targets.

  • length(dataset): Number of observations.

  • convert2image converts features to RGB images.

Examples

julia> using MLDatasets: SVHN2

julia> dataset = SVHN2()
SVHN2:
  metadata    =>    Dict{String, Any} with 2 entries
  split       =>    :train
  features    =>    32×32×3×73257 Array{Float32, 4}
  targets     =>    73257-element Vector{Int64}

julia> dataset[1:5].targets
5-element Vector{Int64}:
 1
 9
 2
 3
 2

julia> dataset.metadata
Dict{String, Any} with 2 entries:
  "n_observations" => 73257
  "class_names"    => ["1", "2", "3", "4", "5", "6", "7", "8", "9", "0"]