
On the CIFAR Dataset

Faisal Qureshi
faisal.qureshi@ontariotechu.ca
http://www.vclab.ca

What is the CIFAR dataset?

  • CIFAR (Canadian Institute For Advanced Research) datasets are widely used in computer vision.
  • Two main versions:
    • CIFAR-10: 60,000 images (10 classes, 6,000 per class).
    • CIFAR-100: 60,000 images (100 classes, 600 per class).
  • Developed by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

CIFAR-10 Dataset

  • 10 classes: Airplane, Automobile, Bird, Cat, Deer, Dog, Frog, Horse, Ship, Truck.
  • Image format: 32x32 RGB images.
  • Training set: 50,000 images.
  • Test set: 10,000 images.
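
The figures above fix the raw size of the dataset. The following is a minimal sketch of that arithmetic, assuming each channel value is stored as one unsigned byte and that the train/test split is balanced across the 10 classes. The variable names are illustrative only.

```python
# Back-of-the-envelope size of CIFAR-10.
# Assumption: 1 byte per channel value, classes balanced in both splits.
HEIGHT, WIDTH, CHANNELS = 32, 32, 3
NUM_CLASSES = 10
TRAIN_IMAGES, TEST_IMAGES = 50_000, 10_000

bytes_per_image = HEIGHT * WIDTH * CHANNELS      # values per image
total_images = TRAIN_IMAGES + TEST_IMAGES
raw_bytes = total_images * bytes_per_image       # uncompressed pixel data
raw_mib = raw_bytes / 2**20

train_per_class = TRAIN_IMAGES // NUM_CLASSES
test_per_class = TEST_IMAGES // NUM_CLASSES

print(f"{bytes_per_image} bytes per image")
print(f"{total_images} images, about {raw_mib:.1f} MiB of raw pixels")
print(f"{train_per_class} training and {test_per_class} test images per class")
```

At well under 200 MiB of raw pixels, the whole dataset fits comfortably in memory on a typical workstation.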

CIFAR-100 Dataset

  • 100 classes grouped into 20 superclasses.
  • Each class has 600 images.
  • Follows the same structure as CIFAR-10: 32x32 RGB images, split into 50,000 training and 10,000 test images (500 and 100 per class).
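
The class hierarchy can be checked with a few lines of arithmetic. This sketch assumes the standard per-class split of 500 training and 100 test images. The constant names are illustrative and are not part of any library.

```python
# CIFAR-100 bookkeeping derived from the figures quoted above.
NUM_CLASSES = 100
NUM_SUPERCLASSES = 20
IMAGES_PER_CLASS = 600
TRAIN_PER_CLASS = 500  # assumption: standard 500/100 train/test split per class

classes_per_superclass = NUM_CLASSES // NUM_SUPERCLASSES
test_per_class = IMAGES_PER_CLASS - TRAIN_PER_CLASS
total_images = NUM_CLASSES * IMAGES_PER_CLASS
train_images = NUM_CLASSES * TRAIN_PER_CLASS
test_images = NUM_CLASSES * test_per_class
images_per_superclass = classes_per_superclass * IMAGES_PER_CLASS

print(f"{classes_per_superclass} classes per superclass")
print(f"{train_images} training + {test_images} test = {total_images} images")
print(f"{images_per_superclass} images per superclass")
```

With only 500 training images per class, a tenth of what CIFAR-10 provides, CIFAR-100 is usually the harder of the two benchmarks.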

Why Use CIFAR?

  • Challenging despite its small size: low resolution and high intra-class variation make it a nontrivial benchmark for deep learning.
  • Diverse classes for object recognition.
  • Benchmarking for CNN architectures.

CIFAR-10 in Python

In [1]:
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

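# Convert images to tensors in [0, 1], then map each channel to [-1, 1] via (x - 0.5) / 0.5
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

# Load the CIFAR-10 training split (downloaded on first use) and batch it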
dataset = datasets.CIFAR10(root='../datasets/common', train=True, transform=transform, download=True)
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
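
The `Normalize` transform in the cell above subtracts a per-channel mean and divides by a per-channel standard deviation. With both set to 0.5, pixel values in [0, 1] are mapped to [-1, 1]. The following plain-Python sketch shows that mapping on single values. The helpers `normalize` and `denormalize` are hypothetical, written only for illustration, and are not torchvision functions.

```python
def normalize(value, mean=0.5, std=0.5):
    """Standardize one channel value: (value - mean) / std."""
    return (value - mean) / std


def denormalize(value, mean=0.5, std=0.5):
    """Invert normalize(), e.g. before displaying an image."""
    return value * std + mean


# The endpoints and midpoint of the [0, 1] range after ToTensor()
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```

Images normalized this way need to be mapped back to [0, 1] (as `denormalize` does) before being passed to `imshow`, otherwise negative values are clipped.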

Sample CIFAR-10 Images

In [2]:
import matplotlib.pyplot as plt
import numpy as np

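# Load the training split without normalization so pixel values stay in [0, 1] for display
# (datasets and transforms are imported in the previous cell)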
cifar = datasets.CIFAR10(root='../datasets/common', train=True, transform=transforms.ToTensor(), download=True)
images, labels = zip(*[cifar[i] for i in range(10)])

# Plot images
fig, axes = plt.subplots(1, 10, figsize=(20, 4))
for i, ax in enumerate(axes):
    ax.imshow(np.transpose(images[i].numpy(), (1, 2, 0)))
    ax.set_title(cifar.classes[labels[i]])
    ax.axis('off')
plt.show()
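
The `np.transpose(..., (1, 2, 0))` call in the plotting loop is needed because `ToTensor()` produces channels-first arrays of shape (C, H, W), whereas `imshow` expects channels-last arrays of shape (H, W, C). A small sketch of the reordering, using a synthetic array as a stand-in for a real image:

```python
import numpy as np

# Stand-in for one CIFAR image in channels-first layout (C, H, W).
chw = np.arange(3 * 32 * 32, dtype=np.float32).reshape(3, 32, 32)

# Move the channel axis last: (C, H, W) -> (H, W, C).
hwc = np.transpose(chw, (1, 2, 0))

print(chw.shape, "->", hwc.shape)

# The value at channel c, row y, column x is unchanged; only the index order differs.
c, y, x = 2, 5, 7
print(hwc[y, x, c] == chw[c, y, x])
```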

Applications of CIFAR

  • Convolutional Neural Network (CNN) training
  • Object recognition and classification
  • Transfer learning and fine-tuning
  • Image augmentation experiments
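
As an illustration of the augmentation experiments mentioned above, here is a minimal NumPy sketch of one commonly used recipe for CIFAR-sized images: zero-pad by a few pixels, take a random crop of the original size, and flip horizontally with probability 0.5. The function name `random_crop_and_flip` and the default padding of 4 are assumptions made for this example, not something the dataset prescribes.

```python
import numpy as np


def random_crop_and_flip(image, padding=4, rng=None):
    """Zero-pad an (H, W, C) image, take a random H x W crop,
    and flip it left-right with probability 0.5."""
    rng = np.random.default_rng() if rng is None else rng
    height, width, _ = image.shape
    padded = np.pad(
        image,
        ((padding, padding), (padding, padding), (0, 0)),
        mode="constant",
    )
    # Top-left corner of the crop; the upper bound is exclusive.
    top = int(rng.integers(0, 2 * padding + 1))
    left = int(rng.integers(0, 2 * padding + 1))
    crop = padded[top:top + height, left:left + width, :]
    if rng.random() < 0.5:
        crop = crop[:, ::-1, :]  # horizontal flip
    return crop


# Synthetic stand-in for one 32 x 32 RGB image with uint8 pixels.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3)).astype(np.uint8)
augmented = random_crop_and_flip(image, rng=rng)
print(augmented.shape, augmented.dtype)
```

The output keeps the shape and dtype of the input, so the function can be applied to each training image before it is converted to a tensor.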

CIFAR Challenges

  • Small image size (32 $\times$ 32) limits detail.
  • Intra-class variability (e.g., different dog breeds in the same class).
  • Susceptible to overfitting, since the training set is small (50,000 images) and the low resolution offers limited detail per image.

Variants and Extensions

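  • Tiny ImageNet: A downsampled ImageNet subset with 200 classes and 64 $\times$ 64 images, larger than CIFAR in both image size and class count.
  • SVHN (Street View House Numbers): Digit classification on 32 $\times$ 32 color crops of house numbers from Google Street View; same image size as CIFAR, different domain.
  • ImageNet: Much larger dataset of high-resolution images (1,000 classes in the standard ILSVRC benchmark) for advanced deep learning models.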

Conclusion

  • CIFAR remains a fundamental dataset for training and evaluating deep learning models.
  • Used in academic research and industry applications.
  • Serves as a stepping stone for working with larger datasets like ImageNet.