Foreword

Doing well in a computer vision course requires both mathematical and computational skills. A poor grasp of undergraduate-level linear algebra or calculus will be an impediment. Strong programming skills are also needed to design "useful" computer vision systems. There are many great resources (books, tutorials, lecture notes, videos, etc.) that will help you learn computer vision theory and methods. Often, however, these resources either assume strong programming skills or leave it to students to develop those skills on their own. The lecture notes included below are aimed at individuals who may benefit from seeing computer vision theory and methods in action. I have attempted to provide Python code examples that make computer vision theory tangible.

Python is now the de facto language for scientific computing. There are, of course, many situations where Python is a poor choice for system development; studying computer vision at an undergraduate level is not one of them. Even if scientific computing is not your primary focus, an above-average working knowledge of Python is probably a good idea.

Yes, even in the age of deep learning, it is important for you to learn computer vision fundamentals. This is especially true if you want to work at the edge of discovery. Many recent computer vision papers leverage "old computer vision knowledge" to develop deep learning systems that achieve state-of-the-art performance on some very challenging computer vision tasks. Knowing computer vision fundamentals will be your competitive advantage when it comes time for you to interview for a job.

You can reach me via e-mail with suggestions, comments, corrections, etc.

Notes

Image formation

Pinhole camera model
Homogeneous coordinates
Intrinsic and extrinsic camera matrices
Lens effects
Camera calibration

Camera calibration

OpenCV checkerboard based camera calibration
Image undistortion

Linear filtering

Linear Filtering in 1D
Cross-correlation
Convolution
Gaussian filter
Gaussian blurring
Separability
Relationship to Fourier transform
Integral images

Image pyramids

Gaussian image pyramids
Laplacian image pyramids
Laplacian blending

Frequency analysis

Frequency analysis of images
Fourier transform
Inverse Fourier transform
Discrete Fourier transform
- Fast Fourier transform (FFT)
Nyquist theorem
Convolution theorem
Properties of Fourier transform
Fourier transform of an image
Why FFT?

Template matching

Sum of squared differences
Normalized sum of squared differences
Cross-correlation
Normalized cross-correlation
Correlation coefficient
Normalized correlation coefficient

Image derivatives

Why do we care about image gradients?
Computing image derivatives
Sobel filters
Gradient magnitude and directions
Visualizing image gradients

Edge detection

Origin of edges
Uses of edge detection
Canny edge detector

Identifying edge pixels using image gradient
Non-maxima suppression
Edge linking via hysteresis
Difference of Gaussian
Implementation in Python

Histograms

Histograms in 1D and 2D
- Construction
- Visualization
Non-uniform bins

Interest points

Uses of interest point detection
Interest point detection and its relationship to feature descriptors and feature matching
Interest point detection
Corner detection

This notebook focuses on interest point detection. We leave feature descriptors and feature matching for another time.

Image sampling

Interpolation basics
Image sampling
Bilinear sampling

Local features

Characteristics of a good local feature
Raw patches as local features
SIFT descriptor
Feature detection and matching in OpenCV
Blob detection
MSER in OpenCV
Applications of local features

Median filtering

Median filtering

Bilateral filtering

Bilateral filtering

Texture analysis

Texture analysis
Filter banks
- Leung-Malik Filter (LM) Bank
  - LM filter construction
- Schmid Filter Bank
- Maximum Response Filter Bank

Least squares

Model fitting: Why?
Linear regression
Least squares
- 2D line fitting example
Total least squares
Aside: Singular Value Decomposition (SVD)

Robust least squares

Robust least squares
Outliers
Loss functions
- Linear loss
- Soft L1 loss
- Huber loss
- Cauchy loss
- arctan loss
- Bisquare loss
Incomplete data
Mixed data

RANSAC

RANSAC
RANSAC for 2D line fitting
RANSAC Algorithm
- Pros
- Cons
- Uses

Hough transform

Hough transform
Fitting lines to data
Polar representation of a line
Counting votes
Applications

Homography

Homography
Homography application: Image stitching
Solving for Homography

Feature tracking and optical flow

Motion cues
Recovering motion
Feature tracking
- Challenges
Lucas-Kanade tracker
Aperture problem
Motion estimation and its relationship to corner detection
Actual and perceived motion
Dealing with large motions
- Coarse-to-fine registration
Shi-Tomasi feature tracker
Perception of motion
Uses of motion
Optical flow
Lucas-Kanade optical flow

Epipolar geometry

The need for multiple views
Depth ambiguity
Estimating scene shape
- Shape from shading
- Shape from defocus
- Shape from texture
- Shape from perspective cues
- Shape from motion
Stereograms, human stereopsis, and disparity
Imaging geometry for a simple stereo system
Epipolar geometry
Fundamental matrix
Essential matrix
Rectification
Stereo matching
Active stereo

Action recognition

Spatio-temporal interest points
Videos as a bag of visual words
Localization in space and in time
Early deep learning models for action recognition

Expectation maximization and Latent semantic analysis

Finite mixture models
Gaussian Mixture Models (GMM)
EM for GMM
Probabilistic latent semantic analysis

Reference material

Computer Vision: Algorithms and Applications by Richard Szeliski
Fundamentals of Computer Vision by Mubarak Shah
Multiple View Geometry in Computer Vision by Richard Hartley and Andrew Zisserman
Computer Vision: Models, Learning, and Inference by Simon Prince

Other useful items

Sam Roweis Notes

Sam Roweis was a Canadian computer scientist specializing in Machine Learning. He is sadly no longer with us. In addition to being a superlative machine learning researcher, Sam was a passionate educator. Below I include his notes on linear algebra and probability. I re-discovered these notes thanks to Inmar Givoni.

Programming Resources

Copyright and License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Last update: 2025-03-27 21:24

Computer Vision

Faisal Qureshi