Faisal Qureshi
Professor
Faculty of Science
Ontario Tech University
Oshawa ON Canada
http://vclab.science.ontariotechu.ca
© Faisal Qureshi
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Often used to achieve scale-invariant processing in the following contexts:
The basic idea for constructing Gaussian image pyramid is as follows:
import numpy as np
import cv2 as cv
import matplotlib.pyplot as plt
# %load solutions/image-pyramids/solution_01.py
#I = cv.imread('data/apple.jpg')
I = cv.imread('data/van-gogh.jpg')
I = cv.cvtColor(I, cv.COLOR_BGR2RGB)
I = cv.resize(I, (512, 512))
print('Shape of I = {}'.format(I.shape))
I[:,:,0] = cv.GaussianBlur(I[:,:,0], (5,5), 2, 2)
I[:,:,1] = cv.GaussianBlur(I[:,:,1], (5,5), 2, 2)
I[:,:,2] = cv.GaussianBlur(I[:,:,2], (5,5), 2, 2)
I2 = I[::2,::2,:]
print('Shape of I2 = {}'.format(I2.shape))
plt.imshow(I2)
# %load solutions/image-pyramids/gen_gaussian_pyramid.py
def gen_gaussian_pyramid(I, levels=6):
G = I.copy()
gpI = [G]
for i in range(levels):
G = cv.pyrDown(G)
gpI.append(G)
return gpI
# %load solutions/image-pyramids/gen_gaussian_pyramid.py
def gen_pyramid(I, levels=6):
G = I.copy()
pI = [G]
for i in range(levels):
G = G[::2,::2,:]
pI.append(G)
return pI
foo = gen_gaussian_pyramid(I, levels=9)
boo = gen_pyramid(I, levels=9)
for i in foo:
print(i.shape)
for i in boo:
print(i.shape)
plt.figure(figsize=(10,5))
plt.subplot(121)
plt.imshow(boo[5])
plt.subplot(122)
plt.imshow(foo[5])
plt.figure(figsize=(2,1))
plt.subplot(121)
plt.imshow(cv.resize(boo[5], (512,512)))
plt.subplot(122)
plt.imshow(cv.resize(foo[5], (512,512)))
OpenCV pyrDown
method performs steps 1 and 2 above. It uses the following kernel to blur the image (in step 1).
A = cv.imread('data/apple.jpg')
B = cv.imread('data/orange.jpg')
A = cv.cvtColor(A, cv.COLOR_BGR2RGB)
B = cv.cvtColor(B, cv.COLOR_BGR2RGB)
gpA = gen_gaussian_pyramid(A)
gpB = gen_gaussian_pyramid(B)
gp = gpB
num_levels = len(gp)
for i in range(num_levels):
rows = gp[i].shape[0]
cols = gp[i].shape[1]
print('level={}: size={}x{}'.format(i, rows, cols))
gp = gpA
level = 3
plt.figure(figsize=(10,10))
plt.title('Level {}'.format(level))
plt.imshow(gp[level])
Proposed by Burt and Adelson is a bandpass image decomposition derived from the Gaussian pyramid. Each level encodes information at a particular spatial frequency. The basic steps for constructing Laplacian pyramids are:
Convolve the original image $g_0$ with a lowpass filter $w$ (e.g., the Gaussian filter) and subsample it by two to create a reduced lowpass version of the image $g_1$. Recall that this is what function pyrDown
does.
Upsample this image (i.e., $g_1$) by inserting zeros in between each row and column and interpolating the missing values by convolving it with the same filter $w$ to create the expanded lowpass image $g'_1$ which is subtracted pixel by pixel from the original to give the detail image $L_0$. Specifically $L_0 = g_0 - g'_1$.
We define Laplacian operator as follows:
$$ \nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} $$In addition we can approximate the Laplacian of a Gaussian as follows:
We use this property when constructing Laplacian image pyramids above.
It is possible to reconstruct the original image $g_0$ from its Laplacian image pyramd consisting of $N+1$ detail images $L_i$, where $i \in [0,N]$ and the low pass image $g_N$.
Laplacian image pyramids are often used for compression. Instead of encoding $g_0$, we encode $L_i$, which are decorrelated and can be represented using far fewer bits.
# %load solutions/image-pyramids/gen_laplacian_pyramid.py
def gen_laplacian_pyramid(gpI):
"""gpI is a Gaussian pyramid generated using gen_gaussian_pyramid method found in py file of the same name."""
num_levels = len(gpI)-1
lpI = [gpI[num_levels]]
for i in range(num_levels,0,-1):
GE = cv.pyrUp(gpA[i])
L = cv.subtract(gpA[i-1],GE)
lpI.append(L)
return lpI
lpA = gen_laplacian_pyramid(gpA)
lpB = gen_laplacian_pyramid(gpB)
lp = lpA
num_levels = len(lp)
for i in range(num_levels):
rows = lp[i].shape[0]
cols = lp[i].shape[1]
print('level={}: size={}x{}'.format(i, cols, rows))
plt.imshow(lp[6])
lp = lpA
level = 3
plt.figure(figsize=(10,10))
plt.title('Level {}'.format(level))
plt.imshow(lp[level])
# %load solutions/image-pyramids/laplacian_blending.py
LS = []
for la, lb in zip(lpA, lpB):
rows, cols, dpt = la.shape
ls = np.hstack((la[:,0:cols//2,:], lb[:,cols//2:,:]))
LS.append(ls)
# %load solutions/image-pyramids/reconstruct_from_laplacian.py
ls_ = LS[0]
for i in range(1,6):
ls_ = cv.pyrUp(ls_)
ls_ = cv.add(ls_, LS[i])
real = np.hstack((A[:,:cols//2],B[:,cols//2:]))
plt.figure(figsize=(10,10))
plt.imshow(ls_)
plt.figure(figsize=(10,10))
plt.imshow(real)