Linear Filtering¶
Faisal Qureshi
Professor
Faculty of Science
Ontario Tech University
Oshawa ON Canada
http://vclab.science.ontariotechu.ca
Copyright information¶
© Faisal Qureshi
License¶
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Lesson Plan¶
- Linear Filtering in 1D
- Cross-correlation
- Convolution
- Gaussian filter
- Gaussian blurring
- Separability
- Relationship to Fourier transform
- Integral images
import numpy as np
import scipy as sp
from scipy import signal
import cv2 as cv
import matplotlib.pyplot as plt
# Load image of Van Gogh, convert from BGR to RGB image
I = cv.imread('data/van-gogh.jpg')
I = cv.cvtColor(I, cv.COLOR_BGR2RGB)
plt.figure(figsize=(7,7))
plt.imshow(I)
plt.xticks([])
plt.yticks([]);
Let's plot a single row of the red channel of image $I$, say, row 200.
row = 200
rowI = I[row,:,0]
plt.figure()
plt.plot(rowI,'-')
plt.plot(rowI,'r.');
Using kernel to compute mean¶
sig = rowI[0:5]   # five samples from the row (renamed to avoid shadowing scipy.signal)
kernel = np.array([1,1,1,1,1]) / 5
print(rowI[0:5])
print('A = ', np.mean(rowI[0:5]))
print('B = ', np.dot(kernel, sig))
[133 127 120 119 125]
A = 124.8
B = 124.80000000000001
Performing moving averages¶
Let's compute a moving average over this row. We assume a window with half-width $w$, i.e., the window spans $2w+1$ samples.
width = 5
f = rowI   # the signal (renamed to avoid shadowing the scipy.signal module)
kernel = np.ones(2*width+1)/(2*width+1) # Array to represent the window (constant-valued)
result = np.ones(len(f)-2*width) # Array to store computed moving averages
for i in range(len(result)):
    centre = i + width
    result[i] = np.dot(kernel, f[centre-width:centre+width+1])
# Generate a plot of the original 1D signal and its average
plt.figure()
plt.plot(f,'r.', alpha=0.3)
plt.plot(f,'-', alpha=0.3)
plt.plot(np.arange(width, len(f)-width),result,'k-');
Linear Filtering in 1D¶
- Signal: $\mathbf{f}$
- Kernel (also sometimes called a filter or a mask): $\mathbf{h}$
- Half-width of the kernel: $w$ (i.e., kernel width is $2 w + 1$)
Cross-Correlation¶
$$ CC(i) = \sum_{k \in [-w,w]} \mathbf{f}(i + k) \mathbf{h}(k) $$
Convolution¶
$$ (\mathbf{f} \ast \mathbf{h} )_i = \sum_{k \in [-w,w]} \mathbf{f}(i - k) \mathbf{h}(k) $$
Flipping kernel¶
Convolution and cross-correlation give the same results for symmetric kernels, since flipping a symmetric kernel leaves it unchanged.
kernel = np.array([3,1,2,1,3])
flipped_kernel = np.flip(kernel)
print('Symmetric kernel')
print(f'kernel={kernel}, flipped={flipped_kernel}')
kernel = np.array([1,1,2,2,3])
flipped_kernel = np.flip(kernel)
print('Asymmetric kernel')
print(f'kernel={kernel}, flipped={flipped_kernel}')
Symmetric kernel
kernel=[3 1 2 1 3], flipped=[3 1 2 1 3]
Asymmetric kernel
kernel=[1 1 2 2 3], flipped=[3 2 2 1 1]
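The practical consequence: numpy.correlate slides the kernel without flipping it, while numpy.convolve flips it first, so the two agree only for symmetric kernels. A minimal sketch (the arrays here are illustrative):
f = np.array([1, 2, 3, 4, 5])
# Symmetric kernel: convolution and cross-correlation agree
h_sym = np.array([1, 2, 1])
print(np.convolve(f, h_sym, mode='valid'))    # [ 8 12 16]
print(np.correlate(f, h_sym, mode='valid'))   # [ 8 12 16]
# Asymmetric kernel: they differ, because convolution flips the kernel
h_asym = np.array([1, 0, -1])
print(np.convolve(f, h_asym, mode='valid'))   # [2 2 2]
print(np.correlate(f, h_asym, mode='valid'))  # [-2 -2 -2]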
Exercise: computing a 1D convolution (from scratch)¶
Compute $\mathbf{f} \ast \mathbf{h}$ given
$$ \mathbf{f} = \left[ \begin{array}{cccccccc} 1 & 3 & 4 & 1 & 10 & 3 & 0 & 1 \end{array} \right] $$
and
$$ \mathbf{h} = \left[ \begin{array}{ccc} 1 & 0 & -1 \end{array}\right] $$
Solution:
# %load solutions/linear-filtering/solution_01.py
# Computing a 1D convolution from scratch
f = np.array([1,3,4,1,10,3,0,1])
h = np.array([1,0,-1])
width = 1
result = np.ones(len(f)-2*width) # Array to store the convolution results
for i in range(len(result)):
    centre = i + width
    result[i] = np.dot(h[::-1], f[centre-width:centre+width+1]) # h[::-1] flips the kernel
print(f'signal f:\t\t{f}')
print(f'kernel h:\t\t{h}')
print(f'convolution (f*h):\t{result}')
signal f: [ 1 3 4 1 10 3 0 1]
kernel h: [ 1 0 -1]
convolution (f*h): [ 3. -2. 6. 2. -10. -2.]
Built-in Functions for 1D Convolution¶
NumPy, of course, has a built-in function numpy.convolve for convolution that wraps the slow loop in compiled code. Notice that, with the definition above, if the original array has length $N$ and the convolution kernel has width $2w+1$, the convolved array has length $N-2w$. In numpy.convolve, this behaviour is selected with the keyword argument mode='valid'.
Exercise: computing a 1D convolution (with convolve)¶
Re-compute $\mathbf{f} \ast \mathbf{h}$ using numpy.convolve given
$$ \mathbf{f} = \left[ \begin{array}{cccccccc} 1 & 3 & 4 & 1 & 10 & 3 & 0 & 1 \end{array} \right] $$
and
$$ \mathbf{h} = \left[ \begin{array}{ccc} 1 & 0 & -1 \end{array}\right]. $$
Solution:
# %load solutions/linear-filtering/solution_02.py
# Computing a 1D convolution using np.convolve
f = np.array([1, 3, 4, 1, 10, 3, 0, 1])
h = np.array([1, 0, -1])
r = np.convolve(f, h, mode='valid')
print(f'signal f:\t\t{f}')
print(f'kernel h:\t\t{h}')
print(f'convolution (f*h):\t{r}')
signal f: [ 1 3 4 1 10 3 0 1]
kernel h: [ 1 0 -1]
convolution (f*h): [ 3 -2 6 2 -10 -2]
Example: Gaussian smoothing for noise removal¶
Tasks:
- Construct a 1D (noisy) signal
- Filter this signal with a 1D Gaussian kernel to suppress the noise and improve the signal-to-noise ratio
np.random.seed(0)
n = 20
x = np.linspace(0, np.pi, n)
y = np.sin(x)
mu = 0.0
sigma = 0.2
y_noisy = y + np.random.normal(mu, sigma, n)
plt.plot(x,y,'.-', label='data', alpha=0.5)
plt.plot(x,y_noisy,'.-r', label='noisy data')
plt.xlim(-.4,3.5)
plt.ylim(-1,2)
plt.xticks([])
plt.yticks([])
plt.legend(['data', 'noisy data']);
Exercise: let's try to recover the original signal (data) from the corrupted signal (noisy data) using smoothing¶
Solution:
# %load solutions/linear-filtering/solution_03.py
f = y_noisy
width = 1
h = np.ones(2*width+1)/(2*width+1)
result = np.ones(len(f)-2*width) # Array to store computed moving averages
for i in range(len(result)):
    centre = i + width
    result[i] = np.dot(h[::-1], f[centre-width:centre+width+1])
plt.plot(x,y,'.-', label='data', alpha=0.3)
plt.plot(x,y_noisy,'.-r', label='noisy data', alpha=0.5)
plt.plot(x[width:-width],result,'.-b', label='smoothed data')
plt.xlim(-.4,3.5)
plt.ylim(-1,2)
plt.xticks([])
plt.yticks([])
plt.legend();
Note that the lengths of x and result are not the same: the 'valid' convolution discards $w$ samples at each end, which is why result is plotted against x[width:-width].
Gaussian in 1D¶
$$ G(\mu,\sigma,x) = \frac{1}{\sigma \sqrt{2 \pi}} \exp \left( - \frac{(x-\mu)^2}{2 \sigma^2} \right) $$
Here $\mu$ and $\sigma$ refer to the mean and standard deviation of this Gaussian.
Aside: Mean and Standard Deviation¶
Given $n$ data points $\{x_1, x_2, \cdots, x_n \}$
$$ \mu = \mathrm{E}[x] = \frac{1}{n} \sum_{i=1}^n x_i $$
$$ \sigma^2 = \mathrm{E}[(x-\mu)^2] = \frac{1}{n} \sum_{i=1}^n (x_i - \mu)^2 $$
Evaluating the Gaussian at particular values¶
mu = 2
sigma = 3
def g(mu, s, x):
    exponent = -((x - mu)**2)/(2*s*s)
    constant = 1/(s * np.sqrt(2 * np.pi))
    return constant * np.exp( exponent )

g_ = np.empty(20)
for i in range(-10,10):
    g_[i+10] = g(mu, sigma, i)   # map i in [-10,10) to indices 0..19
    print(f'Gaussian with mu={mu} and sigma={sigma} evaluated at {i} is {g_[i+10]}')
Gaussian with mu=2 and sigma=3 evaluated at -10 is 4.461007525496179e-05
Gaussian with mu=2 and sigma=3 evaluated at -9 is 0.0001600902172069401
Gaussian with mu=2 and sigma=3 evaluated at -8 is 0.0005140929987637022
Gaussian with mu=2 and sigma=3 evaluated at -7 is 0.001477282803979336
Gaussian with mu=2 and sigma=3 evaluated at -6 is 0.003798662007932481
Gaussian with mu=2 and sigma=3 evaluated at -5 is 0.008740629697903166
Gaussian with mu=2 and sigma=3 evaluated at -4 is 0.017996988837729353
Gaussian with mu=2 and sigma=3 evaluated at -3 is 0.03315904626424957
Gaussian with mu=2 and sigma=3 evaluated at -2 is 0.05467002489199788
Gaussian with mu=2 and sigma=3 evaluated at -1 is 0.0806569081730478
Gaussian with mu=2 and sigma=3 evaluated at 0 is 0.10648266850745075
Gaussian with mu=2 and sigma=3 evaluated at 1 is 0.12579440923099774
Gaussian with mu=2 and sigma=3 evaluated at 2 is 0.1329807601338109
Gaussian with mu=2 and sigma=3 evaluated at 3 is 0.12579440923099774
Gaussian with mu=2 and sigma=3 evaluated at 4 is 0.10648266850745075
Gaussian with mu=2 and sigma=3 evaluated at 5 is 0.0806569081730478
Gaussian with mu=2 and sigma=3 evaluated at 6 is 0.05467002489199788
Gaussian with mu=2 and sigma=3 evaluated at 7 is 0.03315904626424957
Gaussian with mu=2 and sigma=3 evaluated at 8 is 0.017996988837729353
Gaussian with mu=2 and sigma=3 evaluated at 9 is 0.008740629697903166
plt.plot(g_);
Vectorized code for evaluating Gaussian over a range of values
def gaussian1d(mu, sig, n, normalized=True):
    s = n//2
    x = np.linspace(-s,s,n)
    exponent_factor = np.exp(-np.power(x - mu, 2.) / (2 * np.power(sig, 2.)))
    normalizing_factor = 1. if not normalized else 1./(sig * np.sqrt(2*np.pi))
    return normalizing_factor * exponent_factor
n = 9
width = n // 2
sigma = 1.4
g1 = gaussian1d(0.0, sigma, n, normalized=True)
g2 = gaussian1d(0.0, sigma, n, normalized=False)
plt.plot(np.linspace(-width,width,n), g1, 'r-.', label='Normalized')
plt.plot(np.linspace(-width,width,n), g2, 'b-o', label='Not normalized')
plt.title('Gaussian kernels')
plt.legend(loc='center left');
# %load solutions/linear-filtering/solution_05.py
y_smoothed = np.convolve(y_noisy, g1, mode='valid')
plt.plot(x,y,'b', label='data', alpha=0.3)
plt.plot(x,y_noisy,'r', label='noisy data', alpha=0.5)
plt.plot(x[width:-width],y_smoothed,'k', label='smoothed data')
plt.xlim(-.4,3.5)
plt.ylim(-1,2)
plt.xticks([])
plt.yticks([])
plt.legend(loc='upper left');
What do we do with the missing edge values?¶
We will return to this issue later.
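As a preview, numpy.convolve exposes this choice through its mode argument; a minimal sketch:
f = np.array([1, 2, 3, 4, 5])
h = np.ones(3) / 3
print(np.convolve(f, h, mode='valid'))  # length N-2w: boundary samples dropped
print(np.convolve(f, h, mode='same'))   # length N: edges computed as if f were zero-padded
print(np.convolve(f, h, mode='full'))   # length N+2w: every partial overlap kept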
Linear Filter in 2D¶
- Signal: $\mathbf{f}$
- Kernel (also sometimes called a filter or a mask): $\mathbf{h} \in \mathbb{R}^{(2w+1)\times(2h+1)}$
Cross-Correlation¶
$$ CC(i,j) = \sum_{\substack{k \in [-w,w] \\ l \in [-h,h]}} \mathbf{f}(i + k, j+l) \mathbf{h}(k,l) $$
Convolution¶
$$ (\mathbf{f} \ast \mathbf{h} )_{i,j} = \sum_{\substack{k \in [-w,w] \\ l \in [-h,h]}} \mathbf{f}(i - k, j - l) \mathbf{h}(k,l) $$
Convolution is equivalent to cross-correlation with the filter (kernel) flipped in both directions. Convolution and cross-correlation are therefore the same for symmetric filters.
f = np.linspace(0,15,16).reshape((4,4))
h = np.ones((3,3))
print(f'f = \n{f}')
print(f'h = \n{h}')
f = 
[[ 0. 1. 2. 3.]
 [ 4. 5. 6. 7.]
 [ 8. 9. 10. 11.]
 [12. 13. 14. 15.]]
h = 
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
Exercise¶
Compute $(\mathbf{f} \ast \mathbf{h} )_{i,j}$ where $(i,j) = (1,2)$.
Solution:
# Your answer goes here
foo = f[:3,1:]   # the 3x3 window of f centred at (i,j)=(1,2); no flip needed since h is symmetric
print(foo)
print(np.reshape(foo, (9)))
print(np.reshape(h, (9)))
print(np.dot(np.reshape(foo, (9)), np.reshape(h, (9))))
[[ 1. 2. 3.]
 [ 5. 6. 7.]
 [ 9. 10. 11.]]
[ 1. 2. 3. 5. 6. 7. 9. 10. 11.]
[1. 1. 1. 1. 1. 1. 1. 1. 1.]
54.0
2D Averaging Kernels¶
A 3-by-3 summing kernel
average_3_by_3 = np.array([[1,1,1],
                           [1,1,1],
                           [1,1,1]], dtype='float32')
Exercise¶
Construct a 3-by-3 averaging kernel.
Solution:
average_3_by_3 = average_3_by_3/np.sum(average_3_by_3)
print(f'average_3_by_3 = \n{average_3_by_3}')
average_3_by_3 = 
[[0.11111111 0.11111111 0.11111111]
 [0.11111111 0.11111111 0.11111111]
 [0.11111111 0.11111111 0.11111111]]
Convolution with averaging kernels¶
average_3_by_3 = np.ones(9).reshape(3,3) / 9
average_5_by_5 = np.ones(25).reshape(5,5) / 25
average_7_by_7 = np.ones(49).reshape(7,7) / 49
print(f'average_5_by_5 = \n{average_5_by_5}')
average_5_by_5 = 
[[0.04 0.04 0.04 0.04 0.04]
 [0.04 0.04 0.04 0.04 0.04]
 [0.04 0.04 0.04 0.04 0.04]
 [0.04 0.04 0.04 0.04 0.04]
 [0.04 0.04 0.04 0.04 0.04]]
I_gray = cv.cvtColor(I, cv.COLOR_RGB2GRAY)  # I was converted to RGB above
I3 = sp.signal.convolve2d(I_gray, average_3_by_3, mode='same')
I5 = sp.signal.convolve2d(I_gray, average_5_by_5, mode='same')
I7 = sp.signal.convolve2d(I_gray, average_7_by_7, mode='same')
plt.figure(figsize=(20,10))
plt.subplot(241)
plt.imshow(I_gray, cmap='gray')
plt.subplot(242)
plt.imshow(I3, cmap='gray')
plt.subplot(243)
plt.imshow(I5, cmap='gray')
plt.subplot(244)
plt.imshow(I7, cmap='gray')
plt.subplot(245)
plt.imshow(I_gray[215:290,175:250], cmap='gray')
plt.subplot(246)
plt.imshow(I3[215:290,175:250], cmap='gray')
plt.subplot(247)
plt.imshow(I5[215:290,175:250], cmap='gray')
plt.subplot(248)
plt.imshow(I7[215:290,175:250], cmap='gray');
A 2D Kernel that shifts images by two pixels¶
shift_right = np.array([[0,0,0,0,0],
                        [0,0,0,0,0],
                        [0,0,0,0,1],
                        [0,0,0,0,0],
                        [0,0,0,0,0]], dtype='float32')
shift_left = np.array([[0,0,0,0,0],
                       [0,0,0,0,0],
                       [1,0,0,0,0],
                       [0,0,0,0,0],
                       [0,0,0,0,0]], dtype='float32')
A 2D Kernel for finding edge pixels¶
# Responds to horizontal edges (vertical intensity changes)
sobel_x = np.array([[1,2,1],
                    [0,0,0],
                    [-1,-2,-1]], dtype='float32')
E_horizontal = sp.signal.convolve2d(I_gray, sobel_x, mode='same')
plt.figure(figsize=(20,10))
plt.subplot(121)
plt.imshow(I_gray, cmap='gray')
plt.subplot(122)
plt.imshow(E_horizontal, cmap='gray');
# Responds to vertical edges (horizontal intensity changes)
sobel_y = np.array([[1,0,-1],
                    [2,0,-2],
                    [1,0,-1]], dtype='float32')
E_vertical = sp.signal.convolve2d(I_gray, sobel_y, mode='same')
plt.figure(figsize=(20,10))
plt.subplot(121)
plt.imshow(I_gray, cmap='gray')
plt.subplot(122)
plt.imshow(E_vertical, cmap='gray');
Multivariate Gaussian¶
Gaussian function in $k$ dimensions is $$ G(\mathbf{\mu},\Sigma,\mathbf{x}) = \frac{1}{\sqrt{ (2 \pi)^{k} \det(\Sigma) }} \exp \left( -\frac{1}{2} (\mathbf{x} - \mathbf{\mu})^T \Sigma^{-1} (\mathbf{x} - \mathbf{\mu}) \right) $$ where $\Sigma\in\mathbb{R}^{k\times k}$ is the (symmetric positive definite) covariance matrix, and $\mathbf{\mu}\in\mathbb{R}^{k\times1}$ is the mean.
Aside: Mean and Covariance¶
Given an $n \times d$ data matrix
$$ \mathbf{X} = \left[ \begin{array}{cccc} x_{11} & x_{12} & \cdots & x_{1d} \\ x_{21} & x_{22} & \cdots & x_{2d} \\ \vdots & \vdots & & \vdots \\ x_{n1} & x_{n2} & \cdots & x_{nd} \\ \end{array} \right], $$
where each row denotes a data point and each column includes values from all samples for a particular dimension.
Then
$$ \mu_j = \frac{1}{n} \sum_{i=1}^n x_{ij} $$
and covariance is a $d \times d$ matrix whose entries are computed as follows
$$ Cov_{(l,m)} = \frac{1}{n} \sum_{i=1}^n (x_{il} - \mu_l) (x_{im} - \mu_m) $$
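These formulas map directly onto NumPy; a small sketch with a made-up $3 \times 2$ data matrix (bias=True selects the $1/n$ normalization used above, whereas NumPy's default is $1/(n-1)$):
X_data = np.array([[1., 2.],
                   [3., 0.],
                   [5., 4.]])           # n=3 points, d=2 dimensions
mu_hat = X_data.mean(axis=0)            # per-dimension means
cov_hat = np.cov(X_data.T, bias=True)   # d-by-d covariance with 1/n normalization
print(mu_hat)
print(cov_hat)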
# Evaluates 2D Gaussian on xy grid.
# xy is two channel 2D matrix. The first channel stores x coordinates
# [-1 0 1]
# [-1 0 1]
# [-1 0 1]
# and the second channel stores y coordinates
# [-1 -1 -1]
# [0 0 0]
# [1 1 1]
# So then we can pick out an xy value using xy[i,j,:].
# Check out gaussian2_n() to see how you can construct such
# an xy using numpy
#
# For gaussian2_xy() and gaussian2_n() methods
# mean is a 2x1 vector and cov is a 2x2 matrix
def gaussian2_xy(mean, cov, xy):
    invcov = np.linalg.inv(cov)
    results = np.ones([xy.shape[0], xy.shape[1]])
    for x in range(0, xy.shape[0]):
        for y in range(0, xy.shape[1]):
            v = xy[x,y,:].reshape(2,1) - mean
            results[x,y] = (v.T @ invcov @ v).item() # .item() extracts the scalar from the 1x1 result
    results = np.exp( - results / 2 )
    return results
def gaussian2_n(mean, cov, n):
    s = n//2
    x = np.linspace(-s,s,n)
    y = np.linspace(-s,s,n)
    xc, yc = np.meshgrid(x, y)
    xy = np.zeros([n, n, 2])
    xy[:,:,0] = xc
    xy[:,:,1] = yc
    return gaussian2_xy(mean, cov, xy), xc, yc
def gaussian2d(var, n):
    mean = np.array([0, 0])
    mean = mean.reshape(2,1)
    cov = np.array([[var,0],[0,var]])
    k, xc, yc = gaussian2_n(mean, cov, n)
    return k
mean = np.array([0, 0]).reshape(2,1)
cov = np.array([[7,0],[0,7]])
g2d_kernel, xc, yc = gaussian2_n(mean, cov, 17)   # 17x17 kernel with variance 7
from mpl_toolkits.mplot3d import Axes3D
fig = plt.figure()
ax = plt.axes(projection='3d')
surf = ax.plot_surface(xc, yc, g2d_kernel, rstride=1, cstride=1, cmap='coolwarm',
                       linewidth=0, antialiased=False)
a = np.array([1,2,3])
b = np.array([4,5,6])
total = 0   # avoid shadowing the built-in sum
for i in range(3):
    total = total + a[i]*b[i]
print(total)
print(np.dot(a,b))
a = np.reshape(a, (1,3))
print(a)
b = np.reshape(b, (3,1))
print(b)
np.dot(a, b)
32
32
[[1 2 3]]
[[4]
 [5]
 [6]]
array([[32]])
# Don't forget to normalize the kernel weights.
g2d_kernel_normalized = g2d_kernel / np.sum(g2d_kernel)
# Use scipy.signal.convolve2d for 2D convolutions
img_gaussian_smooth = sp.signal.convolve2d(I_gray, g2d_kernel_normalized, mode='same', boundary='fill')
plt.figure(figsize=(15,5))
plt.subplot(131)
plt.title('Image')
plt.imshow(I_gray,cmap='gray')
plt.subplot(132)
plt.title('Kernel')
plt.imshow(g2d_kernel_normalized)
plt.subplot(133)
plt.title('Smoothed image')
plt.imshow(img_gaussian_smooth,cmap='gray');
Gaussian Filter¶
- We often use the following approximation of the Gaussian function
$$ \frac{1}{16} \left[ \begin{array}{ccc} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{array} \right] $$
- Gaussian function has infinite support, but discrete filters use finite kernels
- Variance controls how broad or peaky the filter is
- Blurs the image, i.e., removes "high-frequency" components from the image (acts as low-pass filter)
- The convolution of a Gaussian with itself is another Gaussian
- Convolving twice with a Gaussian kernel of standard deviation $\sigma$ is the same as convolving once with a kernel of standard deviation $\sigma \sqrt{2}$
- Applying a Gaussian filter of standard deviation $\sigma_1$, followed by a Gaussian filter of standard deviation $\sigma_2$, is the same as applying a single Gaussian filter of standard deviation $\sqrt{\sigma_1^2 + \sigma_2^2}$ (see the numerical check after this list)
- All values are positive
- Values sum to $1$. Why is this relevant? (It means filtering preserves the average brightness of the image.)
- The size of the filter, together with its variance, determines the extent of smoothing
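A quick numerical check of the composition property, reusing the gaussian1d helper defined above (a sketch; the match is only approximate because the kernels are sampled and truncated, and the boundaries are zero-padded):
sigma = 2.0
g_a = gaussian1d(0.0, sigma, 25)
g_a = g_a / g_a.sum()                       # normalize the discrete weights
g_b = gaussian1d(0.0, sigma*np.sqrt(2), 25)
g_b = g_b / g_b.sum()
test_signal = np.random.rand(200)
twice = np.convolve(np.convolve(test_signal, g_a, mode='same'), g_a, mode='same')
once = np.convolve(test_signal, g_b, mode='same')
print(np.abs(twice - once)[25:-25].max())   # small away from the boundaries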
Exercise: applying 2D Gaussian & averaging kernels¶
Construct a $15 \times 15$ Gaussian kernel of variance $7$ and blur the image I_gray. Also blur I_gray with a $23 \times 23$ averaging kernel. Compare the results.
Solution:
# %load solutions/linear-filtering/solution_04.py
# Blurring the image I_gray with two different kernels
# Construct & apply 15x15 Gaussian kernel of variance 7
width = 15
gaussian_kernel = gaussian2d(7, width)
gaussian_kernel = gaussian_kernel / np.sum(gaussian_kernel)  # normalize the kernel weights
I_gauss = sp.signal.convolve2d(I_gray, gaussian_kernel, mode='same', boundary='fill')
# Construct & apply 23x23 averaging kernel
width = 23
averaging_kernel = np.ones([width, width])/(width*width)
I_ave = sp.signal.convolve2d(I_gray, averaging_kernel, mode='same', boundary='fill')
# Generate plots
plt.figure(figsize=(10,5))
plt.subplot(121)
plt.title('Gaussian Kernel')
plt.xticks([])
plt.yticks([])
plt.imshow(I_gauss[100:228,100:228], cmap='gray')
plt.subplot(122)
plt.xticks([])
plt.yticks([])
plt.title('Averaging (Box) Kernel')
plt.imshow(I_ave[100:228,100:228], cmap='gray');
Separability¶
An $n$-dimensional filter that can be expressed as an outer product of $n$ 1-dimensional filters is called a separable filter.
Example
$$ \left[ \begin{array}{ccc} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{array} \right] = \left[ \begin{array}{c} 1 \\ 2 \\ 1 \end{array} \right] \left[ \begin{array}{ccc} 1 & 2 & 1 \end{array} \right] $$
Convolutions with Separable Filters (in 2D)¶
- Step 1: Perform row-wise convolution with horizontal filter
- Step 2: Perform column-wise convolution of the result obtained in Step 1 with the vertical filter
Computational considerations¶
Where possible, exploit separability to speed up convolutions. Say $w_k$ and $h_k$ denote the kernel width and height, respectively, and $w$ and $h$ denote the image width and height, respectively; then
For separable filters: $\mathcal{O}(w_k \times w \times h) + \mathcal{O}(h_k \times w \times h)$
For non-separable filters: $\mathcal{O}(w_k \times h_k \times w \times h)$
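A sanity check that two 1D passes reproduce the full 2D convolution (a sketch; the names img, col, row, and K are illustrative):
rng = np.random.default_rng(0)
img = rng.random((64, 64))
col = np.array([[1.], [2.], [1.]])      # vertical factor, shape (3,1)
row = np.array([[1., 2., 1.]])          # horizontal factor, shape (1,3)
K = col @ row                           # the full 3x3 kernel (outer product)
direct = sp.signal.convolve2d(img, K, mode='valid')
two_pass = sp.signal.convolve2d(sp.signal.convolve2d(img, row, mode='valid'),
                                col, mode='valid')
print(np.allclose(direct, two_pass))    # True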
Separable Filters¶
- Use Singular Value Decomposition (SVD) to determine if a filter is separable. If only one singular value is non-zero then the filter is separable.
Recipe
(Step 1) Compute SVD, and check if only one singular value is non-zero
$$ F = \mathbf{U} \Sigma \mathbf{V}^T = \sum_i \sigma_i \mathbf{u}_i \mathbf{v}_i^T, $$
where $\Sigma = \mathrm{diag}(\sigma_i)$.
(Step 2) The vertical and horizontal filters are $\sqrt{\sigma_1}\mathbf{u}_1$ and $\sqrt{\sigma_1}\mathbf{v}_1^T$, respectively.
Code example¶
F = np.array([1,2,1,2,4,2,1,2,1]).reshape(3,3)
print('F =\n{}'.format(F))
print(np.linalg.matrix_rank(F))
u, s, vh = np.linalg.svd(F)
print(f'Singular values of F are: {s}')
print(u)
print(vh)
F =
[[1 2 1]
 [2 4 2]
 [1 2 1]]
1
Singular values of F are: [6.00000000e+00 5.60898698e-16 3.88443959e-49]
[[-4.08248290e-01 9.12870929e-01 3.65318184e-17]
 [-8.16496581e-01 -3.65148372e-01 -4.47213595e-01]
 [-4.08248290e-01 -1.82574186e-01 8.94427191e-01]]
[[-0.40824829 -0.81649658 -0.40824829]
 [-0.91287093 0.36514837 0.18257419]
 [-0. 0.4472136 -0.89442719]]
if s[~np.isclose(s, 0)].shape == (1,):
    print('Filter F is separable')
    ind = np.where(~np.isclose(s, 0))[0][0] # index of the non-zero singular value
    s_1 = s[ind]
    fv = np.sqrt(s_1)*u[:,ind].reshape(3,1)
    fh = np.sqrt(s_1)*vh[ind,:].reshape(1,3)
    print('fv:\n {}'.format(fv))
    print('fh:\n {}'.format(fh))
    print('F_reconstructed:\n {}'.format(np.dot(fv, fh)))
else:
    print('Filter F is not separable')
Filter F is separable
fv:
 [[-1.]
 [-2.]
 [-1.]]
fh:
 [[-1. -2. -1.]]
F_reconstructed:
 [[1. 2. 1.]
 [2. 4. 2.]
 [1. 2. 1.]]
S = np.array([2,3,3,3,5,5,4,4,6]).reshape(3,3)
print('S*F: \n{}'.format(sp.signal.convolve2d(S, F, mode='valid')))
Sv = sp.signal.convolve2d(S, fv, mode='valid')
print('Sv=S*fv: \n{}'.format(Sv))
print('Sv*fh: \n{}'.format(sp.signal.convolve2d(Sv, fh, mode='valid')))
S*F: 
[[65]]
Sv=S*fv: 
[[-12. -17. -19.]]
Sv*fh: 
[[65.]]
Dealing with Missing Values¶
- Set missing entries to a particular value, say $0$
- Repeat boundary entries
- Wrap around (this is useful to create an infinite domain)
- Do nothing, i.e., keep only the outputs that need no missing values. This is rarely done, since the output is then smaller than the input, which creates a host of practical issues. (The sketch below shows all four options.)
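A sketch of these strategies using the boundary keyword of scipy.signal.convolve2d (boundary='fill' zero-pads, 'symm' repeats the boundary entries by reflection, 'wrap' is circular, and mode='valid' is the do-nothing option):
A = np.arange(9, dtype=float).reshape(3, 3)
box = np.ones((3, 3)) / 9
print(sp.signal.convolve2d(A, box, mode='same', boundary='fill'))  # missing entries set to 0
print(sp.signal.convolve2d(A, box, mode='same', boundary='symm'))  # boundary entries repeated
print(sp.signal.convolve2d(A, box, mode='same', boundary='wrap'))  # wrap around
print(sp.signal.convolve2d(A, box, mode='valid'))                  # do nothing: smaller output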
Linear Filtering¶
- Linearity
$$ \mathtt{filter}(\alpha_1 f_1 + \alpha_2 f_2) = \alpha_1 \mathtt{filter}(f_1) + \alpha_2 \mathtt{filter}(f_2) $$
- Shift-invariance
$$ \mathtt{filter}(\mathtt{shift}(f)) = \mathtt{shift}(\mathtt{filter}(f)) $$
Any linear shift-invariant filter can be represented as a convolution.
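Both properties are easy to verify numerically for convolution. A minimal sketch (the shift is implemented by letting zeros flow in from the left, which keeps the check exact):
rng = np.random.default_rng(1)
f1, f2 = rng.random(16), rng.random(16)
h = np.array([1., 2., 1.]) / 4
# Linearity: filter(a1*f1 + a2*f2) == a1*filter(f1) + a2*filter(f2)
a1, a2 = 2.0, -3.0
lhs = np.convolve(a1*f1 + a2*f2, h)
rhs = a1*np.convolve(f1, h) + a2*np.convolve(f2, h)
print(np.allclose(lhs, rhs))            # True
# Shift-invariance: shifting the input shifts the output by the same amount
lhs = np.convolve(np.concatenate([np.zeros(3), f1]), h)
rhs = np.concatenate([np.zeros(3), np.convolve(f1, h)])
print(np.allclose(lhs, rhs))            # True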
Properties of Convolution¶
- Commutative: $a \ast b = b \ast a$
- Associative: $a \ast (b \ast c) = (a \ast b) \ast c$
- Distributes over addition: $a \ast (b+c) = a \ast b + a \ast c$
- Scalars factor out: $ka \ast b = a \ast kb = k (a \ast b)$
- Identity: $ a \ast e = a$, where $e$ is unit impulse
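Each identity can be spot-checked numerically (a sketch; np.convolve's default mode='full' keeps every term so the shapes line up):
rng = np.random.default_rng(2)
a, b, c = rng.random(6), rng.random(3), rng.random(3)
k = 2.5
e = np.array([1.0])                     # unit impulse
print(np.allclose(np.convolve(a, b), np.convolve(b, a)))                          # commutative
print(np.allclose(np.convolve(a, np.convolve(b, c)),
                  np.convolve(np.convolve(a, b), c)))                             # associative
print(np.allclose(np.convolve(a, b + c), np.convolve(a, b) + np.convolve(a, c))) # distributive
print(np.allclose(np.convolve(k*a, b), k*np.convolve(a, b)))                      # scalars factor out
print(np.allclose(np.convolve(a, e), a))                                          # identity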
Relationship to Fourier Transform¶
The Fourier transform of two convolved signals is the element-wise (sometimes called Hadamard) product of their individual Fourier transforms:
$$ \mathcal{F}\left\{f \ast h\right\} = \mathcal{F}\left\{f\right\}\mathcal{F}\left\{h\right\}. $$
This also suggests that
$$ f \ast h = \mathcal{F}^{-1} \left\{ \mathcal{F}\left\{f\right\}\mathcal{F}\left\{h\right\} \right\}. $$
This is an important result. Why? Consider the relative complexity of computing $f \ast h$ directly versus via the Fourier transform trick: direct convolution of two length-$N$ signals costs $\mathcal{O}(N^2)$ operations, whereas the FFT route costs only $\mathcal{O}(N \log N)$.
Note: there are many subtleties to what is meant by "the Fourier Transform," i.e., for signals on an unbounded, continuous domain (defined by integrals) or signals on a discrete bounded domain (the "discrete Fourier transform") and so on.
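A sketch of the trick with NumPy's FFT. The DFT version of the theorem gives circular convolution, so both signals are zero-padded to length len(f)+len(h)-1 to recover ordinary (linear) convolution:
f = np.array([1., 3., 4., 1., 10., 3., 0., 1.])
h = np.array([1., 0., -1.])
n_fft = len(f) + len(h) - 1             # pad to avoid circular wrap-around
via_fft = np.fft.irfft(np.fft.rfft(f, n_fft) * np.fft.rfft(h, n_fft), n_fft)
print(np.allclose(np.convolve(f, h), via_fft))   # True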
- For computational comparisons in Python, please consult The Convolution Theorem & Application Examples at dspillustrations.com
- An excellent technical reference is Briggs & Henson's The DFT: An Owner's Manual for the Discrete Fourier Transform that is very precise, comprehensive, and readable.
(Aside) Integral Images¶
Given an image $X[i,j]$, we compute the integral image as follows:
$$ S[m,n]=\sum_{i \le m} \sum_{j \le n} X[i,j] $$
(think of cumulative sums along both dimensions of an array).
X = np.array([1,2,3,-1,3,2,34,5,3,2,3,2,3,42,5,-3,1,4,98,3,1,2,3,2,5], dtype=np.float32).reshape(5,5)
print('X:\n {}'.format(X))
S = cv.integral(X)[1:,1:]
print('Integral image of X:\n {}'.format(S))
X:
 [[ 1. 2. 3. -1. 3.]
 [ 2. 34. 5. 3. 2.]
 [ 3. 2. 3. 42. 5.]
 [-3. 1. 4. 98. 3.]
 [ 1. 2. 3. 2. 5.]]
Integral image of X:
 [[ 1. 3. 6. 5. 8.]
 [ 3. 39. 47. 49. 54.]
 [ 6. 44. 55. 99. 109.]
 [ 3. 42. 57. 199. 212.]
 [ 4. 45. 63. 207. 225.]]
Uses of Integral Images¶
No matter the size of the summing window, we can compute the window sum using just three arithmetic operations:
$$ \sum_{m_0 \le i \le m_1} \sum_{n_0 \le j \le n_1} X[i,j] = S[m_1,n_1] - S[m_0-1,n_1] - S[m_1,n_0-1] + S[m_0-1,n_0-1]. $$
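For example, using the padded integral image that cv.integral returns (it has a leading row and column of zeros, which removes the edge cases; the helper window_sum below is illustrative). Note that the integral image is equivalently X.cumsum(0).cumsum(1).
S_padded = cv.integral(X)   # (m+1)x(n+1) array with a leading row/column of zeros

def window_sum(S, r0, c0, r1, c1):
    # Sum of X[r0:r1+1, c0:c1+1] using two subtractions and one addition
    return S[r1+1, c1+1] - S[r0, c1+1] - S[r1+1, c0] + S[r0, c0]

print(window_sum(S_padded, 1, 1, 3, 3))   # 192.0
print(X[1:4, 1:4].sum())                  # direct check: 192.0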