Image filtering
Image filtering theory
Filtering is one of the most basic and common image operations in image processing. You can filter an image to remove noise or to enhance features; the filtered image could be the desired result or just a preprocessing step. Regardless, filtering is an important topic to understand.
Local filtering
The "local" in local filtering simply means that a pixel is adjusted by values in some surrounding neighborhood. These surrounding elements are identified or weighted based on a "footprint", "structuring element", or "kernel".
Let's go back to basics and look at a 1D step signal:
Now add some noise to this signal:
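As a minimal sketch of the two signals described so far, something like the following would do. The length (100 samples), the step position (index 50), the noise level (standard deviation 0.35), and the seed are all assumptions chosen for illustration.

```python
import numpy as np

# Assumed parameters: 100 samples, with the step at index 50.
step_signal = np.zeros(100)
step_signal[50:] = 1.0

# Add zero-mean Gaussian noise (the standard deviation 0.35 is an assumption).
rng = np.random.default_rng(seed=0)
noisy_signal = step_signal + rng.normal(0.0, 0.35, size=step_signal.shape)
```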
The simplest way to recover something that looks a bit more like the original signal is to take the average between neighboring "pixels":
What happens if we want to take the three neighboring pixels? We can do the same thing:
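The two- and three-point averages can be written with shifted slices of the array. This is only a sketch: the signal parameters are assumed, and the name smooth_signal for the two-point version is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
step_signal = np.zeros(100)
step_signal[50:] = 1.0
noisy_signal = step_signal + rng.normal(0.0, 0.35, size=step_signal.shape)

# Average each sample with its right-hand neighbor (one sample shorter).
smooth_signal = (noisy_signal[:-1] + noisy_signal[1:]) / 2.0

# Average each sample with both of its neighbors (two samples shorter).
smooth_signal3 = (noisy_signal[:-2] + noisy_signal[1:-1] + noisy_signal[2:]) / 3.0
```

Each extra neighbor adds another shifted slice and removes another sample from the output, which is why this approach quickly becomes unwieldy.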
For averages of more points, the expression keeps getting hairier. And you have to worry more about what's going on in the margins. Is there a better way?
It turns out there is. This same concept, nearest-neighbor averages, can be expressed as a convolution with an averaging kernel. Note that the operation we did with smooth_signal3 can be expressed as follows:
Create an output array called smooth_signal3, of the same length as noisy_signal.
At each element in smooth_signal3, starting at point 1 and ending at point -2, place the sum of: 1/3 of the element to the left of it in noisy_signal, 1/3 of the element at the same position, and 1/3 of the element to the right.
Discard the leftmost and rightmost elements.
This is called a convolution between the input image and the array [1/3, 1/3, 1/3]. (We'll give a more in-depth explanation of convolution in the next section).
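To check that the convolution really is the same operation, we can compare np.convolve against the hand-written three-point average. The signal parameters and the name mean_kernel3 are assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
step_signal = np.zeros(100)
step_signal[50:] = 1.0
noisy_signal = step_signal + rng.normal(0.0, 0.35, size=step_signal.shape)

# Hand-written three-point average.
manual3 = (noisy_signal[:-2] + noisy_signal[1:-1] + noisy_signal[2:]) / 3.0

# The same thing as a convolution; mode="valid" keeps only the positions
# where the kernel fits entirely inside the signal.
mean_kernel3 = np.full(3, 1 / 3)
smooth_signal3 = np.convolve(noisy_signal, mean_kernel3, mode="valid")

print(np.allclose(manual3, smooth_signal3))
```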
The advantage of convolution is that it's just as easy to take the average of 11 points as 3:
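For instance, an 11-point mean only needs a longer kernel. This is a sketch with assumed signal parameters, and mean_kernel11 is a name chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
step_signal = np.zeros(100)
step_signal[50:] = 1.0
noisy_signal = step_signal + rng.normal(0.0, 0.35, size=step_signal.shape)

# An 11-point averaging kernel: eleven equal weights that sum to 1.
mean_kernel11 = np.full(11, 1 / 11)
smooth_signal11 = np.convolve(noisy_signal, mean_kernel11, mode="valid")

# mode="valid" drops 5 samples at each end: 100 - 11 + 1 = 90 remain.
print(smooth_signal11.shape)
```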
Of course, to take the mean of 11 values, we have to move further and further away from the edges, and this starts to be noticeable. You can use mode='same' to pad the edges of the array and compute a result of the same size as the input:
But now we see edge effects on the ends of the signal...
This is because mode='same' actually pads the signal with 0s and then applies mode='valid' as before.
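The zero padding is easiest to see on a constant signal, where any deviation from 1 must come from the padding. The sketch below uses an assumed all-ones signal and compares mode='same' with explicit zero padding followed by mode='valid'.

```python
import numpy as np

signal = np.ones(50)
mean_kernel11 = np.full(11, 1 / 11)

same = np.convolve(signal, mean_kernel11, mode="same")

# Pad 5 zeros on each side by hand, then keep only the "valid" part.
padded = np.pad(signal, 5, mode="constant", constant_values=0)
valid_of_padded = np.convolve(padded, mean_kernel11, mode="valid")

print(np.allclose(same, valid_of_padded))
# At the first sample, only 6 of the 11 kernel taps overlap real data.
print(same[0], same[25])
```

The end values fall toward 6/11 even though the signal is constant, which is the edge effect seen in the plot.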
Exercise: Look up the documentation of scipy.ndimage.convolve. Apply the same convolution, but using a different mode= keyword argument to avoid the edge effects we see here.
A difference filter
Let's look again at our simplest signal, the step signal from before:
Exercise: Can you predict what a convolution with the kernel [-1, 0, 1] does? Try thinking about it before running the cells below.
(For technical signal processing reasons, convolutions actually occur "back to front" between the input array and the kernel. Correlations occur in the signal order, so we'll use correlate from now on.)
Whenever neighboring values are close, the filter response is close to 0. Right at the boundary of a step, we're subtracting a small value from a large value and get a spike in the response. This spike "identifies" our edge.
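Here is a small sketch of the difference filter on the clean step, using np.correlate (an assumption; other correlate functions would also work). The step position at index 50 is assumed. It also shows the "back to front" behavior of convolution: for this antisymmetric kernel, convolving simply flips the sign of the response.

```python
import numpy as np

step_signal = np.zeros(100)
step_signal[50:] = 1.0
diff_kernel = np.array([-1, 0, 1])

# With mode="valid": response[i] = step_signal[i + 2] - step_signal[i].
response = np.correlate(step_signal, diff_kernel, mode="valid")

# Convolution applies the kernel reversed, so the sign of the spike flips.
flipped = np.convolve(step_signal, diff_kernel, mode="valid")

print(np.flatnonzero(response))  # -> [48 49]
print(np.allclose(flipped, -response))
```

The two nonzero outputs correspond to the kernel being centered on samples 49 and 50, that is, directly on either side of the step.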
Commutativity and associativity of filters
What if we try the same trick with our noisy signal?
Oops! We lost our edge!
But recall that we smoothed the signal a bit by taking its neighbors. Perhaps we can do the same thing here. Actually, it turns out that we can do it in any order, so we can create a filter that combines both the difference and the mean:
Note: we use np.convolve here, because it has the option to output a wider result than either of the two inputs.
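A sketch of the combined kernel, together with a numerical check that the order of operations does not matter. The kernel names and the random test signal are assumptions. Because np.convolve flips its kernel, the difference kernel is written as [1, 0, -1] so that a rising edge gives a positive response.

```python
import numpy as np

mean_kernel11 = np.full(11, 1 / 11)
diff_kernel = np.array([1, 0, -1])

# mode="full" returns every position where the two kernels overlap,
# so the combined kernel has 11 + 3 - 1 = 13 taps.
smoothdiff = np.convolve(mean_kernel11, diff_kernel, mode="full")

rng = np.random.default_rng(seed=0)
signal = rng.normal(size=100)

smooth_then_diff = np.convolve(np.convolve(signal, mean_kernel11), diff_kernel)
diff_then_smooth = np.convolve(np.convolve(signal, diff_kernel), mean_kernel11)
combined = np.convolve(signal, smoothdiff)

print(np.allclose(smooth_then_diff, combined))
print(np.allclose(diff_then_smooth, combined))
```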
Now we can use this to find our edge even in a noisy signal:
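As a rough sketch, we can apply a combined smoothing-and-difference kernel to a noisy step and look for the largest response. The noise level (0.2), the seed, and the kernel orientation are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
step_signal = np.zeros(100)
step_signal[50:] = 1.0
noisy_signal = step_signal + rng.normal(0.0, 0.2, size=step_signal.shape)

mean_kernel11 = np.full(11, 1 / 11)
diff_kernel = np.array([1, 0, -1])
smoothdiff = np.convolve(mean_kernel11, diff_kernel, mode="full")

response = np.convolve(noisy_signal, smoothdiff, mode="same")

# The zero padding of mode="same" produces a spurious negative response
# at the right end, so we look for the largest positive value instead of
# the largest magnitude.
edge_index = int(np.argmax(response))
print(edge_index)
```

The response is a plateau roughly ten samples wide around the step rather than a single spike, so the detected index is only accurate to within a few samples.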
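Exercise: The Gaussian filter with variance $\sigma^2$ is given by:
$$k_i = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x_i - x_0)^2}{2\sigma^2}\right)$$
where $x_i$ is the position of sample $i$ and $x_0$ is the center of the filter.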
Create this filter (for example, with width 9, center 4, sigma 1). (Plot it)
Convolve it with the difference filter (with appropriate mode). (Plot the result)
Convolve it with the noisy signal. (Plot the result)
Local filtering of images
Now let's apply all this knowledge to 2D images instead of a 1D signal. Let's start with an incredibly simple image:
This gives the values below:
and looks like a white square centered on a black square:
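One way to build such an image is sketched below. The 7x7 size, the 3x3 bright region, and the name bright_square are assumptions.

```python
import numpy as np

# A 7x7 black image with a 3x3 white square in the middle.
bright_square = np.zeros((7, 7), dtype=float)
bright_square[2:-2, 2:-2] = 1.0

print(bright_square)
```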
The mean filter
For our first example of a filter, consider the following filtering array, which we'll call a "mean kernel". For each pixel, a kernel defines which neighboring pixels to consider when filtering, and how much to weight those pixels.
Now, let's take our mean kernel and apply it to every pixel of the image.
Applying a (linear) filter essentially means:
Center a kernel on a pixel
Multiply the pixels under that kernel by the values in the kernel
Sum all of those results
Replace the center pixel with the summed result
This process is known as convolution.
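The four steps above can be sketched directly with two loops. This is an illustration rather than how a library implements it: apply_linear_filter is a hypothetical helper, the border is handled by zero padding (an assumption), and the kernel is not flipped, which makes no difference for a symmetric kernel like the mean kernel.

```python
import numpy as np

def apply_linear_filter(image, kernel):
    """Apply an odd-sized kernel to every pixel, zero-padding the border."""
    k_rows, k_cols = kernel.shape
    pad_r, pad_c = k_rows // 2, k_cols // 2
    padded = np.pad(image, ((pad_r, pad_r), (pad_c, pad_c)), mode="constant")
    output = np.zeros(image.shape, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            # Step 1: center the kernel on pixel (r, c).
            neighborhood = padded[r:r + k_rows, c:c + k_cols]
            # Steps 2 and 3: multiply element-wise, then sum.
            # Step 4: store the sum in a separate output array, so that
            # later pixels still see the original values.
            output[r, c] = np.sum(neighborhood * kernel)
    return output

bright_square = np.zeros((7, 7))
bright_square[2:-2, 2:-2] = 1.0
mean_kernel = np.full((3, 3), 1 / 9)

filtered = apply_linear_filter(bright_square, mean_kernel)
print(np.round(filtered, 3))
```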
Let's take a look at the numerical result:
The meaning of "mean kernel" should be clear now: Each pixel was replaced with the mean value within the 3x3 neighborhood of that pixel. When the kernel was over n bright pixels, the pixel in the kernel's center was changed to n/9 (= n * 0.111). When no bright pixels were under the kernel, the result was 0.
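To make the n/9 relationship concrete, the sketch below multiplies the filtered result by 9 to recover n, the number of bright pixels under the kernel at each position. It uses scipy.ndimage.convolve on an assumed 7x7 test image.

```python
import numpy as np
from scipy import ndimage as ndi

bright_square = np.zeros((7, 7))
bright_square[2:-2, 2:-2] = 1.0
mean_kernel = np.full((3, 3), 1 / 9)

filtered = ndi.convolve(bright_square, mean_kernel)

# Multiply by 9 to recover the count of bright pixels under the kernel.
bright_counts = np.rint(filtered * 9).astype(int)
print(bright_counts)
```

The count is 9 only at the very center of the square and falls off to 1 at the diagonal neighbors of its corners.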
This filter is a simple smoothing filter and produces two important results:
The intensity of the bright pixel decreased.
The intensity of the region near the bright pixel increased.
Let's see a convolution in action.
(Execute the following cell, but don't try to read it; its purpose is to generate an example.)
Incidentally, the above filtering is the exact same principle behind the convolutional neural networks, or CNNs, that you might have heard much about over the past few years. The only difference is that while above, the simple mean kernel is used, in CNNs, the values inside the kernel are learned to find a specific feature, or accomplish a specific task. Time permitting, we'll demonstrate this in an exercise at the end of the notebook.
Slight aside:
Note that all the values of the kernel sum to 1. Why might that be important?
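One way to explore this question is to compare a kernel that sums to 1 with one that does not. The sketch below uses an assumed 7x7 test image and zero padding. It suggests that a normalized kernel redistributes intensity without changing the total, while an unnormalized one rescales the image.

```python
import numpy as np
from scipy import ndimage as ndi

bright_square = np.zeros((7, 7))
bright_square[2:-2, 2:-2] = 1.0

mean_kernel = np.full((3, 3), 1 / 9)  # weights sum to 1
sum_kernel = np.ones((3, 3))          # weights sum to 9

averaged = ndi.convolve(bright_square, mean_kernel, mode="constant", cval=0.0)
summed = ndi.convolve(bright_square, sum_kernel, mode="constant", cval=0.0)

print(bright_square.sum(), averaged.sum(), summed.sum())
print(averaged.max(), summed.max())
```

With the unnormalized kernel the output values leave the original 0 to 1 range, which matters if later steps assume a fixed intensity scale.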
Downsampled image
Let's consider a real image now. It'll be easier to see some of the filtering we're doing if we downsample the image a bit. We can slice into the image using the "step" argument to sub-sample it (don't scale images using this method for real work; use skimage.transform.rescale):
Here we use a step of 10, giving us every tenth column and every tenth row of the original image. You can see the highly pixelated result on the right.
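The slicing itself looks like the sketch below. A synthetic 512x512 gradient stands in for the real image here (an assumption, so that the example has no data dependency); the name pixelated is also chosen for illustration.

```python
import numpy as np

# Synthetic stand-in for a real grayscale image.
image = np.add.outer(np.arange(512), np.arange(512)).astype(float)

# Keep every tenth row and every tenth column.
pixelated = image[::10, ::10]

print(image.shape, pixelated.shape)
```

Step slicing simply throws away the pixels in between, with no averaging, which is why it should not be used for real rescaling.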
We are actually going to be using the pattern of plotting multiple images side by side quite often, so we are going to make the following helper function:
Mean filter on a real image
Now we can apply the filter to this downsampled image:
Comparing the filtered image to the pixelated image, we can see that this filtered result is smoother: Sharp edges (which are just borders between dark and bright pixels) are smoothed because dark pixels reduce the intensity of neighboring pixels and bright pixels do the opposite.
Essential filters
If you read through the last section, you're already familiar with the essential concepts of image filtering. But, of course, you don't have to create custom filter kernels for all of your filtering needs. There are many standard filter kernels pre-defined from half a century of image and signal processing.
Gaussian filter
The classic image filter is the Gaussian filter. This is similar to the mean filter, in that it tends to smooth images. The Gaussian filter, however, doesn't weight all values in the neighborhood equally. Instead, pixels closer to the center are weighted more than those farther away.
For the Gaussian filter, sigma, the standard deviation, defines the size of the neighborhood.
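A quick way to see the effect of sigma is to filter a single bright pixel: the output is then a picture of the filter's weights. The sketch below uses scipy.ndimage.gaussian_filter on an assumed 21x21 test image with two assumed sigma values.

```python
import numpy as np
from scipy import ndimage as ndi

# A single bright pixel in the middle of a black image.
impulse = np.zeros((21, 21))
impulse[10, 10] = 1.0

narrow = ndi.gaussian_filter(impulse, sigma=1, mode="constant")
wide = ndi.gaussian_filter(impulse, sigma=2, mode="constant")

# Weights fall off with distance from the center.
print(narrow[10, 10], narrow[10, 11], narrow[10, 12])
# A larger sigma spreads the same total weight over a wider neighborhood.
print(narrow.max(), wide.max())
```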
For a real image, we get the following:
This doesn't look drastically different than the mean filter, but the Gaussian filter is typically preferred because of the distance-dependent weighting, and because it does not have any sharp transitions (consider what happens in the Fourier domain!). For a more detailed image and a larger filter, you can see artifacts in the mean filter since it doesn't take distance into account:
(Above, we've tweaked the size of the structuring element used for the mean filter and the standard deviation of the Gaussian filter to produce an approximately equal amount of smoothing in the two results.)
Incidentally, for reference, let's have a look at what the Gaussian filter actually looks like. Technically, the value of the kernel at a pixel that is $r$ rows and $c$ cols from the center is:
$$k_{r, c} = \frac{1}{2\pi\sigma^2} \exp\left(-\frac{r^2 + c^2}{2\sigma^2}\right)$$
Practically speaking, this value is pretty close to zero for values more than $4\sigma$ away from the center, so practical Gaussian filters are truncated at about $4\sigma$:
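As a rough sketch of the kernel described above, we can evaluate the standard 2D Gaussian on a small grid. The truncation radius of 4 sigma is an assumption; it matches the default truncate=4.0 of scipy.ndimage.gaussian_filter. The value sigma = 1 is also assumed.

```python
import numpy as np

sigma = 1.0
radius = int(4 * sigma + 0.5)  # truncate at about 4 standard deviations

# Row and column offsets from the kernel center.
offsets = np.arange(-radius, radius + 1)
r, c = np.meshgrid(offsets, offsets, indexing="ij")

kernel = np.exp(-(r**2 + c**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)

print(kernel.shape)
# Weight at a distance of 4 sigma along one axis, relative to the center.
print(kernel[radius, -1] / kernel[radius, radius])
# The truncated, sampled kernel still sums to very nearly 1.
print(kernel.sum())
```

Library implementations typically renormalize the truncated kernel so that it sums to exactly 1.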
Exercise (Chapter 0 reminder!): Plot the profile of the Gaussian kernel at its midpoint, i.e. the values under the line shown here:
Basic edge filtering
For images, edges are boundaries between light and dark values. The detection of edges can be useful on its own, or it can be used as a preliminary step in other algorithms (which we'll see later).
Difference filters in 2D
For images, you can think of an edge as points where the gradient is large in one direction. We can approximate gradients with difference filters.
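A minimal sketch of a vertical difference filter follows, applied to an assumed 7x7 test image with scipy.ndimage.correlate. The kernel is a column vector, so it compares the pixel below with the pixel above. The names vertical_kernel and gradient_vertical are chosen for illustration.

```python
import numpy as np
from scipy import ndimage as ndi

bright_square = np.zeros((7, 7))
bright_square[2:-2, 2:-2] = 1.0

# A 3x1 kernel: result[r, c] = image[r + 1, c] - image[r - 1, c].
vertical_kernel = np.array([[-1],
                            [0],
                            [1]])

gradient_vertical = ndi.correlate(bright_square, vertical_kernel)
print(gradient_vertical)
```

The response is positive along the top edge of the square (dark to bright going down), negative along the bottom edge, and zero in flat regions.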
Exercise:
Add a horizontal kernel to the above example to also compute the horizontal gradient,
Compute the magnitude of the image gradient at each point:
Sobel edge filter
The Sobel filter, the most commonly used edge filter, should look pretty similar to what you developed above. Take a look at the vertical and horizontal components of the Sobel kernel to see how they differ from your earlier implementation:
Notice that the size of the output matches the input, and the edges aren't preferentially shifted to a corner of the image. Furthermore, the weights used in the Sobel filter produce diagonal edges with responses that are comparable to horizontal or vertical edges.
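For reference, the unnormalized 3x3 Sobel kernels can be written as the outer product of a difference and a small smoothing filter. The sketch below applies them to an assumed 7x7 test image with scipy.ndimage.correlate. Library versions may scale the weights differently, and the variable names are chosen for illustration.

```python
import numpy as np
from scipy import ndimage as ndi

# Difference along one axis, [1, 2, 1] smoothing along the other.
sobel_vertical = np.outer([-1, 0, 1], [1, 2, 1])
sobel_horizontal = sobel_vertical.T
print(sobel_vertical)

bright_square = np.zeros((7, 7))
bright_square[2:-2, 2:-2] = 1.0

gradient_vertical = ndi.correlate(bright_square, sobel_vertical)
gradient_horizontal = ndi.correlate(bright_square, sobel_horizontal)

# Combine the two components into a gradient magnitude.
gradient_magnitude = np.hypot(gradient_vertical, gradient_horizontal)
print(np.round(gradient_magnitude, 2))
```

Both kernels are 3x3 and centered on the pixel, so the output has the same shape as the input and edge responses are not shifted toward a corner.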
As with any derivative, noise can have a strong impact on the result:
Smoothing is often used as a preprocessing step in preparation for feature detection and image-enhancement operations because sharp features can distort results.
Notice how the edges look more continuous in the smoothed image.
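The effect of smoothing before edge detection can be sketched on a synthetic noisy step edge. The image size, the noise level, sigma = 2, and the helper sobel_magnitude are assumptions for this example; scipy.ndimage.sobel provides the two gradient components.

```python
import numpy as np
from scipy import ndimage as ndi

rng = np.random.default_rng(seed=0)

# A vertical step edge at column 32, plus Gaussian noise.
image = np.zeros((64, 64))
image[:, 32:] = 1.0
noisy = image + rng.normal(0.0, 0.2, size=image.shape)

smoothed = ndi.gaussian_filter(noisy, sigma=2)

def sobel_magnitude(img):
    """Gradient magnitude from the two Sobel components (hypothetical helper)."""
    return np.hypot(ndi.sobel(img, axis=0), ndi.sobel(img, axis=1))

edges_noisy = sobel_magnitude(noisy)
edges_smooth = sobel_magnitude(smoothed)

# Compare the response in a flat region, far from the true edge.
flat = (slice(None), slice(2, 20))
print(edges_noisy[flat].mean(), edges_smooth[flat].mean())
# The response at the true edge after smoothing.
print(edges_smooth[:, 31:33].mean())
```

Smoothing suppresses the spurious responses in flat regions at the cost of a wider, weaker response at the real edge.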
Exercise: the simplest neural network. Let's pretend we have an image and a "ground truth" image of what we want to detect:
Can we use machine learning to find a 3x3 convolutional filter that recovers this target?
Use skimage.util.view_as_windows and np.reshape to view the image as a set of (approximately) npixels 3x3 patches. (Hint: why is it only approximate? Think of mode='valid' convolutions.)
Use np.reshape again to see it as npixels "linear" patches of 9 pixels.
Now you have an (npixels, 9) "feature" matrix, X.
Use slicing and np.ravel to get an npixels-length array of target values.
Use sklearn.linear_model.LogisticRegression to learn the relationship between our pixel neighborhoods (of size 9) and the target.
Look at your model.coef_. How do they compare to the Sobel coefficients?
Denoising filters
At this point, we make a distinction. The earlier filters were implemented as a linear dot-product of values in the filter kernel and values in the image. The following kernels implement an arbitrary function of the local image neighborhood. Denoising filters in particular are filters that preserve the sharpness of edges in the image.
As you can see from our earlier examples, mean and Gaussian filters smooth an image rather uniformly, including the edges of objects in an image. When denoising, however, you typically want to preserve features and just remove noise. The distinction between noise and features can, of course, be highly situation-dependent and subjective.
Median Filter
The median filter is the classic edge-preserving filter. As the name implies, this filter takes a set of pixels (i.e. the pixels within a kernel or "structuring element") and returns the median value within that neighborhood. Because regions near a sharp edge will have many dark values and many light values (but few values in between) the median at an edge will most likely be either light or dark, rather than some value in between. In that way, we don't end up with edges that are smoothed.
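To illustrate the edge-preserving behavior, the sketch below compares a 3x3 median filter with a 3x3 mean filter on a synthetic step edge containing two outlier pixels. The image, the outlier positions, and the filter size are assumptions; the filters come from scipy.ndimage.

```python
import numpy as np
from scipy import ndimage as ndi

# Clean image: dark on the left, bright on the right (edge at column 10).
clean = np.zeros((20, 20))
clean[:, 10:] = 1.0

# Add one bright outlier on the dark side and one dark outlier on the bright side.
noisy = clean.copy()
noisy[5, 5] = 1.0
noisy[12, 15] = 0.0

median_result = ndi.median_filter(noisy, size=3)
mean_result = ndi.uniform_filter(noisy, size=3)

# The median removes the outliers and leaves the step untouched.
print(np.array_equal(median_result, clean))
# The mean filter blurs the edge into intermediate values.
print(mean_result[10, 8:12])
```

Next to the edge, six of the nine pixels in each window lie on the same side as the center pixel, so the median still picks that side's value.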
This difference is more noticeable with a more detailed image.
Notice how the edges of coins are preserved after using the median filter.
Further reading
scikit-image also provides more sophisticated denoising filters:
Take a look at this neat feature merged last year: