https://github.com/scikit-image/skimage-tutorials
You can do that in git using:
git clone --depth=1 https://github.com/scikit-image/skimage-tutorials
Images are numpy arrays
Images are represented in scikit-image using standard NumPy arrays. This allows maximum interoperability with other libraries in the scientific Python ecosystem, such as matplotlib and scipy.
Let's see how to build a grayscale image as a 2D array:
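A minimal sketch of this idea, using random values as hypothetical image data (the array size and the seed are arbitrary choices for illustration):

```python
import numpy as np

# Hypothetical example data: 500 rows x 500 columns of random floats in [0, 1).
rng = np.random.default_rng(0)
random_image = rng.random((500, 500))

print(type(random_image))   # a plain numpy.ndarray
print(random_image.shape)   # (rows, columns)
print(random_image.dtype)
```

Each entry of the array is the brightness of one pixel, so any function that accepts a 2D array can operate on this image.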
The same holds for "real-world" images:
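For instance, assuming the sample images bundled with scikit-image are available in your installation, one of them can be loaded and inspected like any other array:

```python
import numpy as np
from skimage import data

# coins() is one of the grayscale sample images in skimage.data.
coins = data.coins()

print(type(coins))    # numpy.ndarray
print(coins.dtype)    # element type of the pixel values
print(coins.shape)    # (rows, columns)
```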
A color image is a 3D array, where the last dimension has size 3 and represents the red, green, and blue channels:
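As an illustration, again assuming the bundled sample data is installed, the shape of a color sample image shows the extra channel axis:

```python
from skimage import data

# chelsea() is a color (RGB) sample image in skimage.data.
cat = data.chelsea()

print(cat.shape)       # (rows, columns, 3)
print(cat[0, 0])       # the red, green, and blue values of the top-left pixel
```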
These are just NumPy arrays. E.g., we can make a red square by using standard array slicing and manipulation:
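A small sketch of that manipulation, using a black synthetic image as a hypothetical stand-in for a photograph (the image size and square position are arbitrary):

```python
import numpy as np

# Hypothetical stand-in for a photo: a 200 x 300 black RGB image (uint8).
image = np.zeros((200, 300, 3), dtype=np.uint8)

# Set rows 10..109 and columns 10..109 to pure red in every channel triple.
image[10:110, 10:110, :] = [255, 0, 0]

print(image[10, 10])     # a pixel inside the square
print(image[150, 150])   # a pixel outside the square
```

The assignment broadcasts the three-element color over every pixel in the slice.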
Images can also include transparent regions by adding a fourth channel, called an alpha layer, to the last dimension.
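As a rough sketch, an alpha layer can be appended to an RGB array along the last axis. The tiny image and the choice of which half is transparent are assumptions made for illustration:

```python
import numpy as np

rgb = np.zeros((4, 6, 3), dtype=np.uint8)          # hypothetical RGB image
alpha = np.full((4, 6, 1), 255, dtype=np.uint8)    # 255 = fully opaque
alpha[:, :3] = 0                                   # left half fully transparent

# Stack the alpha layer after the red, green, and blue channels.
rgba = np.concatenate([rgb, alpha], axis=-1)
print(rgba.shape)
```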
Other shapes, and their meanings
Image type | Coordinates |
---|---|
2D grayscale | (row, column) |
2D multichannel | (row, column, channel) |
3D grayscale (or volumetric) | (plane, row, column) |
3D multichannel | (plane, row, column, channel) |
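The conventions in the table can be illustrated with empty arrays. The sizes below are arbitrary; only the order of the axes matters:

```python
import numpy as np

gray_2d = np.zeros((64, 48))            # (row, column)
color_2d = np.zeros((64, 48, 3))        # (row, column, channel)
gray_3d = np.zeros((10, 64, 48))        # (plane, row, column)
color_3d = np.zeros((10, 64, 48, 3))    # (plane, row, column, channel)

# Indexing follows the same order as the coordinates in the table.
pixel = color_3d[2, 5, 7]               # all channels at plane 2, row 5, column 7
print(pixel.shape)
```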
Displaying images using matplotlib
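A minimal sketch of displaying an array with `imshow`, using a synthetic gradient as a hypothetical image. The `Agg` backend line is only there so the example also runs without a display; in a notebook you would normally leave it out:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical image: a horizontal gradient from black (0) to white (1).
image = np.tile(np.linspace(0, 1, 256), (64, 1))

fig, (ax_gray, ax_color) = plt.subplots(ncols=2, figsize=(8, 3))
ax_gray.imshow(image, cmap="gray")         # grayscale colormap
ax_gray.set_title("cmap='gray'")
ax_color.imshow(image, cmap="viridis")     # same data, different colormap
ax_color.set_title("cmap='viridis'")
for ax in (ax_gray, ax_color):
    ax.axis("off")
# In an interactive session, call plt.show() to open the figure.
```

The same array is drawn twice; only the colormap changes how the values are rendered.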
For more on plotting, see the Matplotlib documentation and pyplot API.
Data types and image values
In the literature, one finds different conventions for representing image values: 0 to 255, where 0 is black and 255 is white, or 0 to 1, where 0 is black and 1 is white.
scikit-image supports both conventions; the choice is determined by the data type of the array.
E.g., here, I generate two valid images:
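One way to build two such images, sketched with plain NumPy (the variable names and the 50 x 50 size are illustrative choices):

```python
import numpy as np

# Floating point image with values spanning 0.0 to 1.0.
linear0 = np.linspace(0, 1, 2500).reshape((50, 50))

# Unsigned byte image with values spanning 0 to 255.
linear1 = np.linspace(0, 255, 2500).reshape((50, 50)).astype(np.uint8)

print(linear0.dtype, linear0.min(), linear0.max())
print(linear1.dtype, linear1.min(), linear1.max())
```

Both arrays describe the same ramp from black to white; only the data type and value range differ.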
The library is designed in such a way that any data-type is allowed as input, as long as the range is correct (0-1 for floating point images, 0-255 for unsigned bytes, 0-65535 for unsigned 16-bit integers).
You can convert images between different representations by using img_as_float, img_as_ubyte, etc.:
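A short sketch of a round trip between the two representations, using a three-pixel array as hypothetical input:

```python
import numpy as np
from skimage import img_as_float, img_as_ubyte

image_u8 = np.array([[0, 128, 255]], dtype=np.uint8)

image_float = img_as_float(image_u8)     # rescales 0..255 to 0.0..1.0
back_to_u8 = img_as_ubyte(image_float)   # rescales 0.0..1.0 back to 0..255

print(image_float.dtype, image_float)
print(back_to_u8.dtype, back_to_u8)
```

Note that these functions rescale the values as well as changing the data type, which a plain `astype` call would not do.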
Your code would then typically look like this:
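A sketch of that pattern follows. The function name and the inversion step are hypothetical; the point is only that the conversion happens once, at the top:

```python
import numpy as np
from skimage import img_as_float

def my_function(any_image):
    """Hypothetical processing function that normalizes its input first."""
    float_image = img_as_float(any_image)
    # From here on, unsigned integer input has been rescaled to [0, 1].
    return 1.0 - float_image   # illustrative operation: invert intensities

result = my_function(np.array([[0, 255]], dtype=np.uint8))
print(result)
```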
We recommend using the floating point representation, given that scikit-image mostly uses that format internally.
Image I/O
Most of the time, your input images won't come from the scikit-image example data sets; they are typically stored in JPEG or PNG format. Since scikit-image operates on NumPy arrays, any image reader library that provides arrays will do. Options include imageio, matplotlib, pillow, etc.
scikit-image conveniently wraps many of these in the io submodule, and will use whichever of the libraries mentioned above are installed:
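A minimal sketch of reading a file with `io.imread`. Since no image path is given here, the example first writes a small synthetic PNG to a temporary directory (the file name and contents are assumptions for illustration):

```python
import os
import tempfile

import numpy as np
from skimage import io

# Hypothetical file contents: a small horizontal gradient.
image = np.tile(np.arange(256, dtype=np.uint8), (32, 1))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gradient.png")
    io.imsave(path, image, check_contrast=False)
    loaded = io.imread(path)    # returns a NumPy array

print(type(loaded), loaded.dtype, loaded.shape)
```

With your own data, only the `io.imread(path)` call is needed.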
We also have the ability to load multiple images, or multi-layer TIFF images:
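One way to do this is with `io.ImageCollection`, which accepts a glob pattern. The sketch below writes two hypothetical PNG files to a temporary directory so that there is something to load:

```python
import os
import tempfile

import numpy as np
from skimage import io

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical inputs: two small PNG files with different constant values.
    for i, value in enumerate([50, 200]):
        frame = np.full((8, 8), value, dtype=np.uint8)
        io.imsave(os.path.join(tmp, f"frame_{i}.png"), frame, check_contrast=False)

    collection = io.ImageCollection(os.path.join(tmp, "*.png"))
    n_images = len(collection)
    shapes = [im.shape for im in collection]   # images are read on access

print(n_images, shapes)
```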
Aside: enumerate
enumerate gives us each element in a container, along with its position.
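For example, with a hypothetical list of names:

```python
animals = ["cat", "dog", "leopard"]

# enumerate yields (position, element) pairs, counting from 0.
for position, animal in enumerate(animals):
    print(f"The animal in position {position} is {animal}")
```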
Exercise: draw the letter H
Define a function that takes as input an RGB image and a pair of coordinates (row, column), and returns a copy with a green letter H overlaid at those coordinates. The coordinates point to the top-left corner of the H.
The arms and strut of the H should have a width of 3 pixels, and the H itself should have a height of 24 pixels and width of 20 pixels.
Start with the following template:
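A possible starting point is sketched below. The function name `draw_H`, its signature, and the default color are assumptions for illustration; the drawing itself is left for you to fill in:

```python
import numpy as np

def draw_H(image, coords, color=(0, 255, 0)):
    """Return a copy of `image` with a letter H drawn at `coords` (row, column)."""
    out = image.copy()   # never modify the caller's image in place

    # ... your code here: set the pixels of the two arms and the strut to `color` ...

    return out
```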
Test your function like so:
Exercise: visualizing RGB channels
Display the different color channels of the image, each as a grayscale image. Start with the following template:
Now, take a look at the following R, G, and B channels. How would their combination look? (Write some code to confirm your intuition.)
Exercise: Convert to grayscale ("black and white")
The relative luminance of an image is the intensity of light coming from each point. Different colors contribute differently to the luminance: it's very hard to have a bright, pure blue, for example. So, starting from an RGB image, the luminance is given by:
Y = 0.2126R + 0.7152G + 0.0722B
Use Python 3.5's matrix multiplication operator, @, to convert an RGB image to a grayscale luminance image according to the formula above.
Compare your results to those obtained with skimage.color.rgb2gray.
Change the coefficients to 1/3 (i.e., take the mean of the red, green, and blue channels) to see how that approach compares with rgb2gray.