Project: scikit

Views: ¹⁹

Kernel: Python 3 (Ubuntu Linux)

Computational Exploration: Timing E. coli

In this Computational Exploration, we make a crude estimate of the mean division time by using a time laplse video of dividing bacteria like that shown in figure below.

In [0]:

Here a snapshot of the growing bacterial culture is shown every 30 minutes. The consecutive images show successive bacterial growth and division process.

The idea in this exercise is to write code that examines the images and determines the total area of the cells as a function of time.

Below is shown how to read one example snapshot, and identify area corresponding to the bacterial cells.

First install the package scikit-image

In [0]:

In [0]:

Example

In [1]:

import skimage
from numpy import*
from scipy.optimize import curve_fit

If no errors are shown by the execution of the above statement then proceed ahead.

In [2]:

%pylab inline
import skimage.io as skio

Populating the interactive namespace from numpy and matplotlib

Reading image

Following images are available.

In [3]:

!ls images/

ls: cannot access 'images/': No such file or directory

We use gc4.jpg for example purposes.

In [4]:

image=[0,0,0,0,0,0]
for i in range (6):
    image[i] = skio.imread('gc'+str(i+1)+'.png')
imshow(image[3]);

Segmenting E. coli

Segmentation, the act of extracting interesting features from an image, is probably the most common taks in image processing. Your brain is very good at this. You probably recognized E. coli right away? But how can we explicitly define features of E. coli that isolate it from the other regions in this image? We can essentially define a brightness filter that will keep only pixels in a certain color or grayscale range. In doing so, it becomes straightforward to isolate E. coli. This process is called thresholding.

Let us plot the histogram of image brighness.

In [5]:

scan = skimage.color.rgb2gray(image[3])
'''plt.hist(scan)
plt.xlabel('Brightness (0=black, 1 = white)')
plt.ylabel('Count')
plt.title('Distribution of pixel brightness');'''

"plt.hist(scan)\nplt.xlabel('Brightness (0=black, 1 = white)')\nplt.ylabel('Count')\nplt.title('Distribution of pixel brightness');"

Identifying E. coli

Which of these regions corresspond to the brightness of E. coli?

Is it the one between 0.5 to 0.6?

Let us check.

In [6]:

mask = (scan > 0.1) & (scan < 0.45)
print(mask)

[[False False False ..., False False False]
 [False False False ..., False False False]
 [False False False ..., False False False]
 ..., 
 [False False False ..., False False False]
 [False False False ..., False False False]
 [False False False ..., False False False]]

This returns True for any pixel that falls into the brightness values of 0.7 and 0.77, and False everywhere else. This process of boolean indexing is also referred to as masking. Let's visualize our mask:

In [7]:

plt.imshow(mask, cmap='gray');

We were wrong. That brightness level did not correspond to that of E. coli, but to that shown in white in the image above.

Make another try.

In [15]:

mask = (scan < 0.443) & (scan > 0.0)
plt.imshow(mask, cmap='gray')

<matplotlib.image.AxesImage at 0x7fbd7e24cf98>

Measuring area

In [16]:

'Area: %s square pixels' % np.sum(mask)

'Area: 5788 square pixels'

The total number of pixels (px) in the image is just the x dimension of the image times the y dimension.

In [10]:

px_x = image[5].shape[0]
px_y = image[5].shape[1]
netarea = px_x * px_y

print ('X px: %s   Y px: %s   Total Area: %s px^2') 
print(px_x, px_y, netarea)

X px: %s   Y px: %s   Total Area: %s px^2
141 207 29187

In [11]:

print ('Percent of image occupied by E. coli: %.2f%%') % print(100.0 * np.sum(mask) / netarea)

Percent of image occupied by E. coli: %.2f%%
9.10679412067

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-9c7c7e58c6d2> in <module>()
----> 1 print ('Percent of image occupied by E. coli: %.2f%%') % print(100.0 * np.sum(mask) / netarea)

TypeError: unsupported operand type(s) for %: 'NoneType' and 'NoneType'

The example ends here. Your turn now.

In [12]:

area=[]
for i in range(6):
    scan = skimage.color.rgb2gray(image[i])
    mask = (skimage.color.rgb2gray(image[i]) < 0.44) & (skimage.color.rgb2gray(image[i]) > 0.0)
    area.append(sum(mask))
print(area)

[549, 927, 1642, 2652, 4185, 5769]

Find the doubling time of E. coli

If we assume that the cells are doubling steadily, this implies that the number of cells as a function of time can be written as $N(t) = N_0 e^{kt}$ . On the assumption that all cells have the same area, by multiplying both sides by the area er cell $A_0$ , we can rewrite this euation oas $A(t) = A_0 e^{kt}$ . The time constant for doubling can be determined by noting that $2N_0 = N_0 e^{kt_{double}}$ , which results in $t_{double} = (1/k) \ln 2$ . The rate constant, in turn, can be read off a graph you will obtain.

Try fitting a curve to the data points, and get better estimate of $k$ .

In [14]:

area=[]
for i in range(6):
    scan = skimage.color.rgb2gray(image[i])
    mask = (scan < 0.443)
    area.append(sum(mask))
print(area)
t=linspace(0,2.5,6)
plot(t,area,'b*', label = 'data set')
N=5000
dif=np.zeros([N])
for i in range(N):
    k=ones(6)*i/N*1.0
    ex=area[0]*exp(k*t)
    dif[i]=sum((ex-area)**2)
k=argmin(dif)/N
print('Decay const')
disp(k)
thalf=log(2)/k*60
print('doubling time')
print("Doubling time in minutes= ", thalf)

decay=area[0]*pow(exp(t),k)
plot(t,decay,'g', label = 'fitting curve')

[551, 928, 1645, 2658, 4195, 5788]
Decay const
0.9614
doubling time
Doubling time in minutes=  43.2586133073

[<matplotlib.lines.Line2D at 0x7fbd7e26d518>]

Growth curves

Numerically solve the following differential equation: $\frac{dN}{dt} = kN$ and plot $N$ as a function of $t$ it. Take $N(t=0) = N_0 = 1$ , and $k=1$ .

In [29]:

[<matplotlib.lines.Line2D at 0x7fb3e36c1da0>]

Numerically, solve the following differential equation: $\frac{dN}{dt} = kN\left(1-\frac{N}{K}\right)$ and plot $N$ as a function of $t$ it. Take $N(t=0) = N_0 = 1$ , $k=1$ , and $K=1$ .

In [30]:

[<matplotlib.lines.Line2D at 0x7fb3e3626cf8>]

Compare the growth curves in the above two cases.