Faisal Qureshi
Professor
Faculty of Science
Ontario Tech University
Oshawa ON Canada
http://vclab.science.ontariotechu.ca
© Faisal Qureshi
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
import numpy as np
import scipy as sp
from scipy import signal
import cv2 as cv
import matplotlib.pyplot as plt
# Load image of Van Gogh, convert from BGR to RGB image
I = cv.imread('data/van-gogh.jpg')
I = cv.cvtColor(I, cv.COLOR_BGR2RGB)
plt.figure(figsize=(7,7))
plt.imshow(I)
plt.xticks([])
plt.yticks([]);
Let's plot a single row of the red channel of image $I$, say, row 200.
row = 200
rowI = I[row,:,0]
plt.figure()
plt.plot(rowI,'-')
plt.plot(rowI,'r.');
The mean over a window can be written as a dot product with a constant kernel:
sig = rowI[0:5]     # note: the name 'signal' would shadow the scipy.signal module imported above
kernel = np.array([1,1,1,1,1]) / 5
print(rowI[0:5])
print('A = ', np.mean(rowI[0:5]))
print('B = ', np.dot(kernel, sig))
Let's perform a moving average on this row. We assume a window of half-width $w$, i.e., $2w+1$ samples.
width = 5
sig = rowI    # again, avoid shadowing the scipy.signal module
kernel = np.ones(2*width+1)/(2*width+1)  # array to represent the window (constant-valued)
result = np.ones(len(sig)-2*width)       # array to store computed moving averages
for i in range(len(result)):
    centre = i + width
    result[i] = np.dot(kernel, sig[centre-width:centre+width+1])
# Generate a plot of the original 1D signal and its average
plt.figure()
plt.plot(sig,'r.', alpha=0.3)
plt.plot(sig,'-', alpha=0.3)
plt.plot(np.arange(width, len(sig)-width), result, 'k-');
Convolution and cross-correlation give the same results for symmetric kernels, since flipping a symmetric kernel leaves it unchanged:
kernel = np.array([3,1,2,1,3])
flipped_kernel = np.flip(kernel)
print('Symmetric kernel')
print(f'kernel={kernel}, flipped={flipped_kernel}')
kernel = np.array([1,1,2,2,3])
flipped_kernel = np.flip(kernel)
print('Asymmetric kernel')
print(f'kernel={kernel}, flipped={flipped_kernel}')
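A quick numerical check (a minimal sketch using numpy.convolve and numpy.correlate in 'valid' mode): the two operations agree for the symmetric kernel and differ for the asymmetric one.
f = np.array([1, 3, 4, 1, 10, 3, 0, 1])
sym = np.array([3, 1, 2, 1, 3])    # symmetric kernel
asym = np.array([1, 1, 2, 2, 3])   # asymmetric kernel
print(np.convolve(f, sym, mode='valid'), np.correlate(f, sym, mode='valid'))    # identical
print(np.convolve(f, asym, mode='valid'), np.correlate(f, asym, mode='valid'))  # different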
Compute $\mathbf{f} \ast \mathbf{h}$ given
$$ \mathbf{f} = \left[ \begin{array}{cccccccc} 1 & 3 & 4 & 1 & 10 & 3 & 0 & 1 \end{array} \right] $$
and
$$ \mathbf{h} = \left[ \begin{array}{ccc} 1 & 0 & -1 \end{array}\right]. $$
Solution:
# %load solutions/linear-filtering/solution_01.py
# Computing a 1D convolution from scratch
f = np.array([1,3,4,1,10,3,0,1])
h = np.array([1,0,-1])
width = 1
result = np.ones(len(f)-2*width)  # array to store the convolution result
for i in range(len(result)):
    centre = i + width
    result[i] = np.dot(h[::-1], f[centre-width:centre+width+1])  # h[::-1] flips the kernel
print(f'signal f:\t\t{f}')
print(f'kernel h:\t\t{h}')
print(f'convolution (f*h):\t{result}')
NumPy, of course, has a built-in function numpy.convolve for convolution that wraps the slow loop in compiled code. Notice that, with the definition above, if the original array has length $N$ and the convolution kernel has width $2w+1$, the convolved array has length $N-2w$. In numpy.convolve, this behaviour is enforced using the keyword argument mode='valid'.
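For example (a small sketch of the three modes numpy.convolve accepts):
f = np.array([1, 3, 4, 1, 10, 3, 0, 1])
h = np.array([1, 0, -1])
print(len(np.convolve(f, h, mode='full')))   # N + 2w = 10
print(len(np.convolve(f, h, mode='same')))   # N      = 8
print(len(np.convolve(f, h, mode='valid')))  # N - 2w = 6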
Exercise (numpy.convolve)
Re-compute $\mathbf{f} \ast \mathbf{h}$ using numpy.convolve given
$$ \mathbf{f} = \left[ \begin{array}{cccccccc} 1 & 3 & 4 & 1 & 10 & 3 & 0 & 1 \end{array} \right] $$
and
$$ \mathbf{h} = \left[ \begin{array}{ccc} 1 & 0 & -1 \end{array}\right]. $$
Solution:
# %load solutions/linear-filtering/solution_02.py
# Computing a 1D convolution using np.convolve
f = np.array([1, 3, 4, 1, 10, 3, 0, 1])
h = np.array([1, 0, -1])
r = np.convolve(f, h, mode='valid')
print(f'signal f:\t\t{f}')
print(f'kernel h:\t\t{h}')
print(f'convolution (f*h):\t{r}')
Task: generate a noisy sine wave (code below) and smooth it using a moving-average filter.
np.random.seed(0)
n = 20
x = np.linspace(0, np.pi, n)
y = np.sin(x)
mu = 0.0
sigma = 0.2
y_noisy = y + np.random.normal(mu, sigma, n)
plt.plot(x,y,'.-', label='data', alpha=0.5)
plt.plot(x,y_noisy,'.-r', label='noisy data')
plt.xlim(-.4,3.5)
plt.ylim(-1,2)
plt.xticks([])
plt.yticks([])
plt.legend(['data', 'noisy data']);
Solution:
# %load solutions/linear-filtering/solution_03.py
f = y_noisy
width = 1
h = np.ones(2*width+1)/(2*width+1)
result = np.ones(len(f)-2*width)  # array to store computed moving averages
for i in range(len(result)):
    centre = i + width
    result[i] = np.dot(h[::-1], f[centre-width:centre+width+1])
plt.plot(x,y,'.-', label='data', alpha=0.3)
plt.plot(x,y_noisy,'.-r', label='noisy data', alpha=0.5)
plt.plot(x[width:-width],result,'.-b', label='smoothed data')
plt.xlim(-.4,3.5)
plt.ylim(-1,2)
plt.xticks([])
plt.yticks([])
plt.legend(['data', 'noisy data', 'smoothed data']);
Note that the lengths of x and result are not the same; that is why result is plotted against x[width:-width].
The 1D Gaussian is defined as
$$ g(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right). $$
Here $\mu$ and $\sigma$ refer to the mean and standard deviation of this Gaussian.
Given $n$ data points $\{x_1, x_2, \cdots, x_n \}$,
$$ \mu = \mathrm{E}[x] = \frac{1}{n} \sum_{i=1}^n x_i $$
and
$$ \sigma^2 = \mathrm{E}[(x-\mu)^2] = \frac{1}{n} \sum_{i=1}^n (x_i - \mu)^2. $$
mu = 2
sigma = 3
def g(mu, s, x):
    exponent = -((x - mu)**2)/(2*s*s)
    constant = 1/(s * np.sqrt(2 * np.pi))
    return constant * np.exp( exponent )

g_ = np.empty(20)
for i in range(-10,10):
    g_[i+10] = g(mu, sigma, i)   # map i in [-10,10) to array indices 0..19
    print(f'Gaussian with mu={mu} and sigma={sigma} evaluated at {i} is {g_[i+10]}')
plt.plot(g_);
Vectorized code for evaluating Gaussian over a range of values
def gaussian1d(mu, sig, n, normalized=True):
    s = n//2
    x = np.linspace(-s,s,n)
    exponent_factor = np.exp(-np.power(x - mu, 2.) / (2 * np.power(sig, 2.)))
    normalizing_factor = 1. if not normalized else 1./(sig * np.sqrt(2*np.pi))
    return normalizing_factor * exponent_factor
n = 9
width = n // 2
sigma = 1.4
g1 = gaussian1d(0.0, sigma, n, normalized=True)
g2 = gaussian1d(0.0, sigma, n, normalized=False)
plt.plot(np.linspace(-width,width,n), g1, 'r-.', label='Normalized')
plt.plot(np.linspace(-width,width,n), g2, 'b-o', label='Not normalized')
plt.title('Gaussian kernels')
plt.legend(loc='center left');
# %load solutions/linear-filtering/solution_05.py
y_smoothed = np.convolve(y_noisy, g1, mode='valid')
plt.plot(x,y,'b', label='data', alpha=0.3)
plt.plot(x,y_noisy,'r', label='noisy data', alpha=0.5)
plt.plot(x[width:-width],y_smoothed,'k', label='smoothed data')
plt.xlim(-.4,3.5)
plt.ylim(-1,2)
plt.xticks([])
plt.yticks([])
plt.legend(loc='upper left')
Note again that the smoothed signal is shorter than the input; we will return to this boundary issue later.
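In the meantime, a minimal sketch of one common workaround (assuming y_noisy, g1, and width from the cells above): pad the signal before filtering so the output matches the input length. The choice of np.pad's edge mode here is illustrative, not the only option.
f_padded = np.pad(y_noisy, width, mode='edge')    # replicate the boundary samples
y_same = np.convolve(f_padded, g1, mode='valid')  # output now has the same length as y_noisy
print(len(y_noisy), len(y_same))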
Convolution is equivalent to cross-correlation with the filter (kernel) flipped in both directions. Convolution and cross-correlation therefore coincide for symmetric filters.
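The same fact in 2D (a minimal sketch using scipy.signal.convolve2d and scipy.signal.correlate2d with an asymmetric kernel):
A = np.arange(16, dtype=float).reshape(4, 4)
K = np.array([[1., 0., -1.],
              [2., 0., -2.],
              [1., 0., -1.]])
conv = sp.signal.convolve2d(A, K, mode='valid')
corr = sp.signal.correlate2d(A, K, mode='valid')
corr_flipped = sp.signal.correlate2d(A, np.flip(K), mode='valid')
print(np.allclose(conv, corr))          # False: K is not symmetric
print(np.allclose(conv, corr_flipped))  # True: convolution = correlation with a flipped kernel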
f = np.linspace(0,15,16).reshape((4,4))
h = np.ones((3,3))
print(f'f = \n{f}')
print(f'h = \n{h}')
Compute $(\mathbf{f} \ast \mathbf{h})_{i,j}$ where $(i,j) = (1,2)$.
Solution:
# Your answer goes here
foo = f[:3,1:]   # the 3x3 window of f centred at (i,j) = (1,2)
print(foo)
print(np.reshape(foo, (9)))
print(np.reshape(h, (9)))
# h is symmetric, so no flip is needed before the dot product
print(np.dot(np.reshape(foo, (9)), np.reshape(h, (9))))
A 3-by-3 summing kernel
average_3_by_3 = np.array([[1,1,1],
[1,1,1],
[1,1,1]], dtype='float32')
Construct a 3-by-3 averaging kernel.
Solution:
average_3_by_3 = average_3_by_3/np.sum(average_3_by_3)
print(f'average_3_by_3 = \n{average_3_by_3}')
average_3_by_3 = np.ones(9).reshape(3,3) / 9
average_5_by_5 = np.ones(25).reshape(5,5) / 25
average_7_by_7 = np.ones(49).reshape(7,7) / 49
print(f'average_5_by_5 = \n{average_5_by_5}')
I_gray = cv.cvtColor(I, cv.COLOR_RGB2GRAY)  # I was converted to RGB above
I3 = sp.signal.convolve2d(I_gray, average_3_by_3, mode='same')
I5 = sp.signal.convolve2d(I_gray, average_5_by_5, mode='same')
I7 = sp.signal.convolve2d(I_gray, average_7_by_7, mode='same')
plt.figure(figsize=(20,10))
plt.subplot(241)
plt.imshow(I_gray, cmap='gray')
plt.subplot(242)
plt.imshow(I3, cmap='gray')
plt.subplot(243)
plt.imshow(I5, cmap='gray')
plt.subplot(244)
plt.imshow(I7, cmap='gray')
plt.subplot(245)
plt.imshow(I_gray[215:290,175:250], cmap='gray')
plt.subplot(246)
plt.imshow(I3[215:290,175:250], cmap='gray')
plt.subplot(247)
plt.imshow(I5[215:290,175:250], cmap='gray')
plt.subplot(248)
plt.imshow(I7[215:290,175:250], cmap='gray');
shift_right = np.array([[0,0,0,0,0],
[0,0,0,0,0],
[0,0,0,0,1],
[0,0,0,0,0],
[0,0,0,0,0]], dtype='float32')
shift_left = np.array([[0,0,0,0,0],
[0,0,0,0,0],
[1,0,0,0,0],
[0,0,0,0,0],
[0,0,0,0,0]], dtype='float32')
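The shift kernels above are defined but not applied; a quick sanity check on a tiny test image (a minimal sketch) shows that convolving with shift_right moves content two pixels to the right:
test = np.zeros((5, 5), dtype='float32')
test[2, 2] = 1.0   # single bright pixel in the centre
print(sp.signal.convolve2d(test, shift_right, mode='same'))   # the pixel lands at (2, 4)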
# Sobel kernel for the vertical (y) derivative; it responds to horizontal edges
sobel_y = np.array([[1,2,1],
                    [0,0,0],
                    [-1,-2,-1]], dtype='float32')
E_horizontal = sp.signal.convolve2d(I_gray, sobel_y, mode='same')
plt.figure(figsize=(20,10))
plt.subplot(121)
plt.imshow(I_gray, cmap='gray')
plt.subplot(122)
plt.imshow(E_horizontal, cmap='gray');
# Sobel kernel for the horizontal (x) derivative; it responds to vertical edges
sobel_x = np.array([[1,0,-1],
                    [2,0,-2],
                    [1,0,-1]], dtype='float32')
E_vertical = sp.signal.convolve2d(I_gray, sobel_x, mode='same')
plt.figure(figsize=(20,10))
plt.subplot(121)
plt.imshow(I_gray, cmap='gray')
plt.subplot(122)
plt.imshow(E_vertical, cmap='gray')
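Combining the two Sobel responses gives the gradient magnitude, a standard way to visualize overall edge strength (an illustrative extra step, not part of the original exercises):
E_mag = np.sqrt(E_horizontal**2 + E_vertical**2)  # per-pixel gradient magnitude
plt.figure(figsize=(10,10))
plt.imshow(E_mag, cmap='gray')
plt.title('Gradient magnitude');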
The Gaussian function in $k$ dimensions is
$$ G(\mathbf{\mu},\Sigma,\mathbf{x}) = \frac{\exp\left( -\frac{1}{2} (\mathbf{x} - \mathbf{\mu})^T \Sigma^{-1} (\mathbf{x} - \mathbf{\mu}) \right)}{\sqrt{ (2 \pi)^{k} \det(\Sigma) }}, $$
where $\Sigma\in\mathbb{R}^{k\times k}$ is the (symmetric positive definite) covariance matrix, and $\mathbf{\mu}\in\mathbb{R}^{k\times 1}$ is the mean.
Given an $n \times d$ data matrix
$$ \mathbf{X} = \left[ \begin{array}{cccc} x_{11} & x_{12} & \cdots & x_{1d} \\ x_{21} & x_{22} & \cdots & x_{2d} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nd} \\ \end{array} \right], $$where each row denotes a data point and each column includes values from all samples for a particular dimension.
Then
$$ \mu_j = \frac{1}{n} \sum_{i=1}^n x_{ij}, $$
and the covariance is a $d \times d$ matrix whose entries are computed as follows:
$$ \mathrm{Cov}_{(l,m)} = \frac{1}{n} \sum_{i=1}^n (x_{il} - \mu_l) (x_{im} - \mu_m). $$
# Evaluates 2D Gaussian on xy grid.
# xy is two channel 2D matrix. The first channel stores x coordinates
# [-1 0 1]
# [-1 0 1]
# [-1 0 1]
# and the second channel stores y coordinates
# [-1 -1 -1]
# [0 0 0]
# [1 1 1]
# So then we can pick out an xy value using xy[i,j,:].
# Check out gaussian2_n() to see how you can construct such
# an xy using numpy
#
# For gaussian2_xy() and gaussian2_n() methods
# mean is a 2x1 vector and cov is a 2x2 matrix
def gaussian2_xy(mean, cov, xy):
    invcov = np.linalg.inv(cov)
    results = np.ones([xy.shape[0], xy.shape[1]])
    for x in range(0, xy.shape[0]):
        for y in range(0, xy.shape[1]):
            v = xy[x,y,:].reshape(2,1) - mean
            results[x,y] = np.dot(np.dot(np.transpose(v), invcov), v)
    results = np.exp( - results / 2 )
    return results
def gaussian2_n(mean, cov, n):
    s = n//2
    x = np.linspace(-s,s,n)
    y = np.linspace(-s,s,n)
    xc, yc = np.meshgrid(x, y)
    xy = np.zeros([n, n, 2])
    xy[:,:,0] = xc
    xy[:,:,1] = yc
    return gaussian2_xy(mean, cov, xy), xc, yc
def gaussian2d(var, n):
    mean = np.array([0, 0])
    mean = mean.reshape(2,1)
    cov = np.array([[var,0],[0,var]])
    k, xc, yc = gaussian2_n(mean, cov, n)
    return k
n = 17
mean = np.array([0, 0])
mean = mean.reshape(2,1)
cov = np.array([[7,0],[0,7]])
g2d_kernel, xc, yc = gaussian2_n(mean, cov, n)
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = plt.axes(projection='3d')
surf = ax.plot_surface(xc, yc, g2d_kernel,rstride=1, cstride=1, cmap='coolwarm',
linewidth=0, antialiased=False)
a = np.array([1,2,3])
b = np.array([4,5,6])
total = 0    # rename: 'sum' would shadow the Python builtin
for i in range(3):
    total = total + a[i]*b[i]
print(total)
print(np.dot(a,b))
a = np.reshape(a, (1,3))
print(a)
b = np.reshape(b, (3,1))
print(b)
np.dot(a, b)
# Don't forget to normalize the kernel weights.
g2d_kernel_normalized = g2d_kernel / np.sum(g2d_kernel)
# Use scipy.signal.convolve2d for 2D convolutions
img_gaussian_smooth = sp.signal.convolve2d(I_gray, g2d_kernel_normalized, mode='same', boundary='fill')
plt.figure(figsize=(15,5))
plt.subplot(131)
plt.title('Image')
plt.imshow(I_gray,cmap='gray')
plt.subplot(132)
plt.title('Kernel')
plt.imshow(g2d_kernel_normalized)
plt.subplot(133)
plt.title('Smoothed image')
plt.imshow(img_gaussian_smooth,cmap='gray');
Construct a $15 \times 15$ Gaussian kernel of variance $7$ and blur the image I_gray. Also blur I_gray with a $23 \times 23$ averaging kernel. Compare the results.
Solution:
# %load solutions/linear-filtering/solution_04.py
# Blurring the image I_gray with two different kernels
width = 15
# Construct & apply 15x15 Gaussian kernel (variance 7)
gaussian_kernel = gaussian2d(7, width)
gaussian_kernel = gaussian_kernel / np.sum(gaussian_kernel)   # normalize the kernel weights
I_gauss = sp.signal.convolve2d(I_gray, gaussian_kernel, mode='same', boundary='fill')
# Construct & apply 23x23 averaging kernel (as the exercise asks)
width = 23
averaging_kernel = np.ones([width, width])/(width*width)
I_ave = sp.signal.convolve2d(I_gray, averaging_kernel, mode='same', boundary='fill')
# Generate plots
plt.figure(figsize=(10,5))
plt.subplot(121)
plt.title('Gaussian Kernel')
plt.xticks([])
plt.yticks([])
plt.imshow(I_gauss[100:228,100:228], cmap='gray')
plt.subplot(122)
plt.xticks([])
plt.yticks([])
plt.title('Averaging (Box) Kernel')
plt.imshow(I_ave[100:228,100:228], cmap='gray');
An $n$-dimensional filter that can be expressed as an outer product of $n$ 1-dimensional filters is called a separable filter.
Example
$$ \left[ \begin{array}{ccc} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{array} \right] = \left[ \begin{array}{c} 1 \\ 2 \\ 1 \end{array} \right] \left[ \begin{array}{ccc} 1 & 2 & 1 \end{array} \right] $$
Where possible, exploit separability to speed up convolutions (see the timing sketch after this list). If $w_k$ and $h_k$ denote the kernel width and height, and $w$ and $h$ denote the image width and height, then the costs are:
For separable filters: $\mathcal{O}(w_k \times w \times h) + \mathcal{O}(h_k \times w \times h)$
For non-separable filters: $\mathcal{O}(w_k \times h_k \times w \times h)$
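To see the pay-off (a minimal timing sketch; exact numbers depend on the machine, and the image here is random), convolve with a 21-by-21 box kernel directly and as two 1D passes, and confirm the two results agree:
import time
img = np.random.rand(256, 256)
ksize = 21
box2d = np.ones((ksize, ksize)) / (ksize * ksize)   # non-separable form
box_row = np.ones((1, ksize)) / ksize               # horizontal 1D pass
box_col = np.ones((ksize, 1)) / ksize               # vertical 1D pass
t0 = time.time()
direct = sp.signal.convolve2d(img, box2d, mode='same')
t1 = time.time()
separable = sp.signal.convolve2d(sp.signal.convolve2d(img, box_row, mode='same'), box_col, mode='same')
t2 = time.time()
print(f'direct: {t1-t0:.3f}s, separable: {t2-t1:.3f}s')
print(np.allclose(direct, separable))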
Recipe
(Step 1) Compute the SVD and check whether only one singular value is non-zero:
$$ F = \mathbf{U} \Sigma \mathbf{V}^T = \sum_i \sigma_i \mathbf{u}_i \mathbf{v}_i^T, $$
where $\Sigma = \mathrm{diag}(\sigma_i)$.
(Step 2) The vertical and horizontal filters are $\sqrt{\sigma_1}\mathbf{u}_1$ and $\sqrt{\sigma_1}\mathbf{v}_1^T$, respectively.
F = np.array([1,2,1,2,4,2,1,2,1]).reshape(3,3)
print('F =\n{}'.format(F))
print(f'rank(F) = {np.linalg.matrix_rank(F)}')
u, s, vh = np.linalg.svd(F)
print(f'Singular values of F are: {s}')
print(f'u =\n{u}')
print(f'vh =\n{vh}')
if s[~np.isclose(s, 0)].shape == (1,):
    print('Filter F is separable')
    ind = np.where(~np.isclose(s, 0))[0][0]  # index of the non-zero singular value
    s_1 = s[ind]
    fv = np.sqrt(s_1)*u[:,ind].reshape(3,1)
    fh = np.sqrt(s_1)*vh[ind,:].reshape(1,3)
    print('fv:\n {}'.format(fv))
    print('fh:\n {}'.format(fh))
    print('F_reconstructed:\n {}'.format(np.dot(fv, fh)))
else:
    print('Filter F is not separable')
S = np.array([2,3,3,3,5,5,4,4,6]).reshape(3,3)
print('S*F: \n{}'.format(sp.signal.convolve2d(S, F, mode='valid')))
Sv = sp.signal.convolve2d(S, fv, mode='valid')
print('Sv=S*fv: \n{}'.format(Sv))
print('Sv*fh: \n{}'.format(sp.signal.convolve2d(Sv, fh, mode='valid')))
Any linear shift-invariant filter can be represented as a convolution.
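A quick numerical illustration of shift invariance (a minimal sketch): filtering a shifted signal gives the shifted filtered signal, up to boundary effects.
f = np.random.rand(64)
h = np.array([1., 2., 1.]) / 4
filter_then_shift = np.roll(np.convolve(f, h, mode='same'), 5)
shift_then_filter = np.convolve(np.roll(f, 5), h, mode='same')
# equal away from the boundaries and the wrap-around seam introduced by np.roll
print(np.allclose(filter_then_shift[8:-8], shift_then_filter[8:-8]))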
The Fourier transform of two convolved signals is the element-wise (sometimes called Hadamard) product of their individual Fourier transforms:
$$ \mathcal{F}\left\{f \ast h\right\} = \mathcal{F}\left\{f\right\}\mathcal{F}\left\{h\right\}. $$
This also suggests that
$$ f \ast h = \mathcal{F}^{-1} \left\{ \mathcal{F}\left\{f\right\}\mathcal{F}\left\{h\right\} \right\}. $$
This is an important result. Why? Consider the relative complexity of computing $f \ast h$ directly versus via the Fourier transform: direct convolution of two length-$N$ signals costs $\mathcal{O}(N^2)$ operations, while the FFT route costs only $\mathcal{O}(N \log N)$.
Note: there are many subtleties to what is meant by "the Fourier Transform," i.e., for signals on an unbounded, continuous domain (defined by integrals) or signals on a discrete bounded domain (the "discrete Fourier transform") and so on.
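A numerical check of the discrete case (a minimal sketch; both signals are zero-padded to the full convolution length $N+M-1$ before taking the FFT):
f = np.random.rand(32)
h = np.random.rand(7)
n = len(f) + len(h) - 1                # length of the full linear convolution
direct = np.convolve(f, h, mode='full')
via_fft = np.real(np.fft.ifft(np.fft.fft(f, n) * np.fft.fft(h, n)))
print(np.allclose(direct, via_fft))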
(Illustration credit: dspillustrations.com)
Given an image $X[i,j]$, we compute the integral image as follows:
$$ S[m,n]=\sum_{i \le m} \sum_{j \le n} X[i,j] $$
(think of cumulative sums along both dimensions of an array).
X = np.array([1,2,3,-1,3,2,34,5,3,2,3,2,3,42,5,-3,1,4,98,3,1,2,3,2,5], dtype=np.float32).reshape(5,5)
print('X:\n {}'.format(X))
S = cv.integral(X)[1:,1:]
print('Integral image of X:\n {}'.format(S))
No matter the size of the summing window, we can compute its sum using three arithmetic operations: the sum of $X$ over rows $m_1 \le i \le m_2$ and columns $n_1 \le j \le n_2$ is
$$ S[m_2,n_2] - S[m_1-1,n_2] - S[m_2,n_1-1] + S[m_1-1,n_1-1]. $$
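For example (a minimal sketch; it uses the zero-padded integral image that cv.integral returns, so no special cases are needed at the borders):
S_pad = cv.integral(X)       # 6x6: a leading row and column of zeros, then cumulative sums
r0, r1 = 1, 4                # rows 1..3 of X
c0, c1 = 2, 5                # cols 2..4 of X
block_sum = S_pad[r1, c1] - S_pad[r0, c1] - S_pad[r1, c0] + S_pad[r0, c0]
print(block_sum, np.sum(X[r0:r1, c0:c1]))   # the two values agree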