Kernel: Python 3 (Anaconda 5)

A basic aggregation model -- Vicsek

In this document, we'll implement the basic Vicsek aggregation model and demonstrate how to code in Python along the way.

First, let's take the model description from Topaz, Chad M., Lori Ziegelmeier, and Tom Halverson. 2015. “Topological Data Analysis of Biological Aggregation Models.” PLOS ONE 10 (5): e0126383. https://doi.org/10/f7mp7k.

Each agent ("synthetic cell") in the aggregation model is described by a position vector $\mathbf{x}_n$ and associated velocity vector $\mathbf{v}_n$ . The velocity vector is of constant magnitude (speed) and only its angle $\theta_i$ changes: $\mathbf{v}_n = \left[ \begin{smallmatrix} \cos \theta_n \\ \sin \theta_n\end{smallmatrix}\right]$ . For $n = 1,\dots, N$

\begin{align} \theta_n(t + \Delta t) &= \underbrace{\frac{1}{N} \sum_{\vert \mathbf{x}_n - \mathbf{x}_k \vert < R} \theta_k(t)}_{\text{avg.angle of neighbors}} + \underbrace{U([-\eta/2, \eta/2])}_{\text{unif. random perturbation}}\\ \mathbf{x}_n(t+\Delta t) &= \mathbf{x}_{n-1}(t) + \Delta t \mathbf{v}_{n} \end{align}

The first thing to notice here is notation -- $t$ is the 'current' time, while $t + \Delta t$ then means "one step in future". This is similar to the definition od derivative in calculus, where $\begin{equation}f'(t)\approx \frac{f(t+\Delta t) - f(t)}{\Delta t}.\end{equation}$

OK, so the overall structure of this dynamical system is as follows:

FUTURE = FUNCTION( PRESENT )

where FUTURE and PRESENT have the same structure: in our case, containing positions and orientations of the "cells". We call these vectors "states", and we talk about "initial state", "current state", and "next state". We use the name "orbit" if we want to refer to all past states -- a dynamical "movie" of evolution of our dynamical system.

We will now write Python code that simulates such a system.

In [1]:

import numpy; # use the package for numerical (matrix) calculations
N = 5; # number of cells

Each frame ("snapshot") of our movie will be represented by a vector with $N\times 3$ rows -- this is because we need to store three numbers for each cell - orientation $\theta$ , and position point (two coordinates) $\mathbf{x}$ . Since we want $N$ cells, we need $N \times 3$ elements in any snapshot.

Now, before we start to simulate anything, let's give ourselves some random starting cells. We will work with a square domain of size $D = 25$ . Notice below that we'll have to decide what element of state vector will correspond to horizontal, vertical, and angle coordinates. So let's do this: first three elements will correspond to the first cell, second three to the second, and so on. Within each block of 3, first will be angle, second horizontal and third vertical coordinate.

In [2]:

state0 = numpy.zeros( shape=(N*3,)) # set everything to zero
print(state0)

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]

Now, all zeros is a boring state. Below, I set each component to a random number - angles to random number between $[0,2\pi]$ , and spatial coordinates to random numbers between $[0,D]$ . Each time I'll print the intermediate result to see what's going on. The random number generator returns random values between $(0,1)$ so if we want a different range, we have to multiply the output by the maximum. (If we wanted a different lower bound, we'd additionally add a number to the output)

In [3]:

state0[0::3] = numpy.random.rand(N,) * 2 * numpy.pi # set angles to random angles - state0[0::3] means "select every third element starting from 0th (initial element)"
print(state0)

[2.35223395 0.         0.         1.05374273 0.         0.
 6.13784365 0.         0.         0.654006   0.         0.
 4.92356805 0.         0.        ]

By the way, if you wanted to get more information on any function, just type help(name): (for quickhelp you could also use name?)

In [4]:

help(numpy.random.rand)

Help on built-in function rand:

rand(...) method of mtrand.RandomState instance
    rand(d0, d1, ..., dn)
    
    Random values in a given shape.
    
    Create an array of the given shape and populate it with
    random samples from a uniform distribution
    over ``[0, 1)``.
    
    Parameters
    ----------
    d0, d1, ..., dn : int, optional
        The dimensions of the returned array, should all be positive.
        If no argument is given a single Python float is returned.
    
    Returns
    -------
    out : ndarray, shape ``(d0, d1, ..., dn)``
        Random values.
    
    See Also
    --------
    random
    
    Notes
    -----
    This is a convenience function. If you want an interface that
    takes a shape-tuple as the first argument, refer to
    np.random.random_sample .
    
    Examples
    --------
    >>> np.random.rand(3,2)
    array([[ 0.14022471,  0.96360618],  #random
           [ 0.37601032,  0.25528411],  #random
           [ 0.49313049,  0.94909878]]) #random

In [5]:

D = 25
state0[1::3] = numpy.random.rand(N,) * D # set horizontal coordinates 
print(state0)

[ 2.35223395 13.51291043  0.          1.05374273 18.39008161  0.
  6.13784365  8.93053955  0.          0.654006   24.1849503   0.
  4.92356805 23.99564915  0.        ]

In [6]:

state0[2::3] = numpy.random.rand(N,) * D # set horizontal coordinates 
print(state0)

[ 2.35223395 13.51291043 20.99354012  1.05374273 18.39008161  4.74242946
  6.13784365  8.93053955 12.25707003  0.654006   24.1849503  17.66442445
  4.92356805 23.99564915 21.72015455]

OK, printing cells as numbers is boring. Let's graph them.

In [7]:

import matplotlib.pyplot as plt; #this gives us access to plotting functions
%matplotlib inline
# this makes plots appear in the same window as code

ax = plt.axes()
ax.set_xlim(0,D)
ax.set_ylim(0,D)

for n in range(N):
    mytheta, myx, myy = state0[ numpy.array([0,1,2])+3*n ] # extract coordinates for the n-th cell
    ax.arrow( myx, myy, 0.5*numpy.cos(mytheta), 0.5*numpy.sin(mytheta), \
             head_width=0.5, head_length=0.1, fc='k', ec='k') # plot an arrow

plt.show()

Now, we'll likely want to plot cells over and over. So let's package all that code in a function, so we can easily plot cells.

In [8]:

def plotcells(statevector, D=25):
    ax = plt.axes()
    ax.set_xlim(0,D)
    ax.set_ylim(0,D)
    arScale = D/250
    N = numpy.size(statevector) // 3 # determine the number of cells from the size of statevector; // means integer division
    for n in range(N):
        mytheta, myx, myy = statevector[ numpy.array([0,1,2])+3*n ] # extract coordinates for the n-th cell
        ax.arrow( myx, myy, 5*arScale*numpy.cos(mytheta), 5*arScale*numpy.sin(mytheta), \
                 head_width=2*arScale, head_length=arScale, fc='k', ec='k') # plot an arrow

    plt.show()
    return ax

Now we can simply plot our state vector by calling:

In [9]:

ax = plotcells( state0 )

In the spirit of function making, let's create also a function that creates as many initial cells as we want:

In [10]:

def initializeCells(N, D=25):
    state0 = numpy.zeros( shape=(N*3,))
    state0[0::3] = numpy.random.rand(N,) * 2 * numpy.pi
    state0[1::3] = numpy.random.rand(N,) * D
    state0[2::3] = numpy.random.rand(N,) * D
    return state0

Now we can very easily initialize a lot of cells and plot them.

In [11]:

N = 50
state0 = initializeCells(N)
plotcells(state0)

<matplotlib.axes._subplots.AxesSubplot at 0x7f3f7ee8bfd0>

Now it's time to start moving cells around. The "movie" of cells is called an "trajectory". Each element of the trajectory is a snapshot - we will store such a trajectory into a snapshot matrix, where each column will represent one time step.

In [12]:

N = 5
state0 = initializeCells(N)

T = 10; # number of time steps we want to simulate

trajectory = numpy.zeros( shape=(N*3, T));
trajectory[:,0] = state0
numpy.set_printoptions(precision=1) # print at most 2 decimals to keep things compact
print(trajectory)

[[ 2.5  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [24.5  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [21.3  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [ 5.5  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [11.5  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [23.8  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [ 4.8  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [18.3  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [21.8  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [ 2.1  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [18.1  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [18.1  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [ 0.8  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [22.8  0.   0.   0.   0.   0.   0.   0.   0.   0. ]
 [ 0.5  0.   0.   0.   0.   0.   0.   0.   0.   0. ]]

OK, so our trajectory is pretty boring - everything just goes to zero. Instead, we have to implement our present -> future function. Let's call that function "update rule". Remember, its mathematical form was

\begin{align} \theta_n(t + \Delta t) &= \underbrace{\frac{1}{N} \sum_{\vert \mathbf{x}_n - \mathbf{x}_k \vert < R} \theta_k(t)}_{\text{avg.angle of neighbors}} + \underbrace{U([-\eta/2, \eta/2])}_{\text{unif. random perturbation}}\\ \mathbf{v}_n &= \left[ \begin{smallmatrix} \cos(\theta_k) \\ \sin(\theta_k)\end{smallmatrix}\right] \\ \mathbf{x}_n(t+\Delta t) &= \mathbf{x}_{n-1}(t)+ \Delta t \mathbf{v}_{n} \end{align}

Let's write a first draft of this function:

In [13]:

def update_rule_1(present, stepsize = 1):
    future = present # first set the future to whatever present was
    N = numpy.size(present) // 3
    for n in range(N):
        mytheta = present[3*n]
        v = numpy.array([numpy.cos(mytheta), numpy.sin(mytheta)])
        future[ numpy.array([1,2]) + 3*n ] = present[ numpy.array([1,2])+3*n ] + stepsize*v
    return(future)

We can now iterate the update rule for the trajectory and plot first three steps

In [14]:

for t in range(T-1):
    trajectory[:,t+1] = update_rule_1( trajectory[:,t], 3) # repeatedly apply update rule to columns of our trajectory matrix

for t in range(3):
    plotcells(trajectory[:,t])

We can further show the track of each individual cell as a line to see what they are doing.

In [15]:

def plottrajectories(trajectory, D=25):
    ax = plt.axes()
    ax.set_xlim(0,D)
    ax.set_ylim(0,D)
    plt.plot( trajectory[1::3,:].T, trajectory[2::3,:].T ,'.')

plottrajectories(trajectory)

We already see there are problems. First, some cells clearly go off screen. Second, cells don't really change their orientation. Let's tackle these things in order, by rewriting our update_rule.

To prevent cells from exiting the domain, we'll just periodize the domain. This is easily done using mod() function: mod calculates remainder of division. So, if a cell steps into location 26, then because 26 = 1 + 25, mod(26,25) = 1 so the cell reappears on the other side of the domain.

In [16]:

def update_rule_2(present, stepsize = 1, D=25):
    future = present # first set the future to whatever present was
    N = numpy.size(present) // 3
    for n in range(N):
        mytheta = present[3*n]
        v = numpy.array([numpy.cos(mytheta), numpy.sin(mytheta)])
        future[ numpy.array([1,2]) + 3*n ] = present[ numpy.array([1,2])+3*n ] + stepsize*v
        future[ numpy.array([1,2]) + 3*n ] = numpy.mod( future[ numpy.array([1,2]) + 3*n ], D) # <-- this is the added line
    return(future)

# let's rerun everything
for t in range(T-1):
    trajectory[:,t+1] = update_rule_2( trajectory[:,t], 3) # repeatedly apply update rule to columns of our trajectory matrix

plottrajectories(trajectory)

Notice that now dots that go off screen appear on the other end.

Now for the second task: change in orientation. We will also add speed as the parameter $v_0$ in the process.

\theta_n(t + \Delta t) = \underbrace{\frac{1}{N} \sum_{\vert \mathbf{x}_n - \mathbf{x}_k \vert < R} \theta_k(t)}_{\text{avg.angle of neighbors}} + \underbrace{U([-\eta/2, \eta/2])}_{\text{unif. random perturbation}}

We will first add the random component and then the average angle of neighbors.

In [17]:

def update_rule_3(present, stepsize = 1, eta=0.1, D=25, v0=0.03):
    future = present # first set the future to whatever present was
    N = numpy.size(present) // 3
    for n in range(N):
        mytheta = present[3*n]
        v = v0*numpy.array([numpy.cos(mytheta), numpy.sin(mytheta)])
        future[ numpy.array([1,2]) + 3*n ] = present[ numpy.array([1,2])+3*n ] + stepsize*v
        future[ numpy.array([1,2]) + 3*n ] = numpy.mod( future[ numpy.array([1,2]) + 3*n ], D)

    # add noise to angle
    noise = (-eta/2 + numpy.random.rand( N, )* eta/2 )
    future[0::3] = present[0::3] + noise
    
    return(future)

Let's rerun everything but for longer time and smaller time step, to give the cells a chance to wander about

In [18]:

T = 100; # number of time steps we want to simulate
trajectory = numpy.zeros( shape=(N*3, T));
trajectory[:,0] = state0
for t in range(T-1):
    trajectory[:,t+1] = update_rule_3( trajectory[:,t], v0=1 ) # repeatedly apply update rule to columns of our trajectory matrix

plottrajectories(trajectory)

Alright! Our cells are wandering around as if drunk, but they don't sense each other - each is doing its own thing. The synchronization comes from the neighbor part of the update rule, which we have yet to implement.

By the way, randomly moving cells don't wander far in 2D but they do in 3D. http://pi.math.cornell.edu/~mec/Winter2009/Thompson/randomwalks.html

Now, neighbors are defined by the subscript of the summation sign in $\underbrace{\frac{1}{N} \sum_{\vert \mathbf{x}_n - \mathbf{x}_k \vert < R} \theta_k(t)}_{\text{avg.angle of neighbors}}$ that is: neighbors are those whose coordinates are closer than some distance $R$ .

We are going to implement that by cycling through the list of cells, computing the distance between current cell position and all others, and every time we notice a neighbor, add its angle to the list of angles we'll average.

In [19]:

def update_rule_4(present, stepsize = 1, eta=0.1, D=25, R = 1, v0=0.03):
    future = present # first set the future to whatever present was
    N = numpy.size(present) // 3
    neighborMeanAngles = numpy.zeros(N,)    
    for n in range(N):
        mytheta = present[3*n]
        mypos = present[numpy.array([1,2]) + 3*n ]
        
        ## START NEW CODE
        angleList = [] # list of angles to average is empty at first

        for m in range(N): # loop through all other cells            
            themPos = present[numpy.array([1,2]) + 3*m ] 
            if (numpy.linalg.norm(mypos - themPos) < R) : # distance between vectors is less than R
                angleList.append( present[3*m] ) # add to list of angles to average
            
        neighborMeanAngles[n] = numpy.sum( numpy.array( angleList ) ) / len(angleList)
        ## FINISH NEW CODE
        v = v0*numpy.array([numpy.cos(mytheta), numpy.sin(mytheta)])
        future[ numpy.array([1,2]) + 3*n ] = present[ numpy.array([1,2])+3*n ] + stepsize*v
        future[ numpy.array([1,2]) + 3*n ] = numpy.mod( future[ numpy.array([1,2]) + 3*n ], D)

    noise = (-eta/2 + numpy.random.rand( N, )* eta/2 ) # add noise to angle
    future[0::3] = present[0::3] + neighborMeanAngles + noise
    
    return(future)

Now, let's run our simulation loop one more time, with the new update rule:

In [20]:

N = 30
state0 = initializeCells(N)

T = 300; # number of time steps we want to simulate

trajectory = numpy.zeros( shape=(N*3, T));
trajectory[:,0] = state0

for t in range(T-1):
    trajectory[:,t+1] = update_rule_4( trajectory[:,t], stepsize=5, eta=0.1, R=1 ) # repeatedly apply update rule to columns of our trajectory matrix

plottrajectories(trajectory)

This is now a mess - we have no idea where cells started, where they have ended, etc. (We used a slightly larger stepsize=5 to increase the speed.) So let's plot side by side the first frame of our simulation and the last to see if we can spot a difference in clustering.

In [21]:

plotcells(trajectory[:,0])
plotcells(trajectory[:,-1])

<matplotlib.axes._subplots.AxesSubplot at 0x7f3f7d10f828>

I can't really tell if there is a significant change in clustering. Perhaps the density of cells is too small, perhaps the radius of interaction is too small (most likely). So we'll have to play with the parameters... but first, notice that the simulation took a bit of time to complete.

If we increased the number of cells and time, this simulation would take an even-longer time. This is because for every time step, we are running through cells twice - once to find neighbors, and once to update cell values. We can shorten the neighbor calculation by using scipy.spatial.distance.pdist function, which computes pairwise distances between all cells in one go. Then we can just select those cells that are nearby.

In [22]:

import scipy.spatial
def update_rule_5(present, stepsize = 1, eta=0.1, D=25, R = 1, v0=0.03):
    future = present # first set the future to whatever present was
    N = numpy.size(present) // 3
    neighborMeanAngles = numpy.zeros(N,)
    
    # create a matrix where each row is a cell, and columns are x-y coordinates
    positions = numpy.zeros((N,2))
    positions[:,0] = present[1::3] 
    positions[:,1] = present[2::3]
    angles = present[0::3]
    
    # first, identify all neighbors at once
    Dmatrix = scipy.spatial.distance.pdist(positions) 
    Dmatrix = scipy.spatial.distance.squareform( Dmatrix ) # Dmatrix[i,j] is a distance between cells i and j
    Neighbors = Dmatrix <= R # Neighbors[i,j] = true if distance between i and j is closer than R
    # numpy.fill_diagonal(Neighbors, False) # make sure that cell is never it's own neighbor Neighbors[i,i] = False!
    
    for n in range(N):
        mytheta = angles[n] # get cell angle
        mypos = positions[n,:] # get cell position
        
        sel = Dmatrix[:,n] < R # selection vector - true where distance is smaller than R, false otherwise
        neighborMeanAngles[n] = numpy.sum( angles[Neighbors[:,n]] ) / numpy.sum( Neighbors[:,n] ) # compute mean neighbor angle
        
         
        v = v0*numpy.array([numpy.cos(mytheta), numpy.sin(mytheta)])
        future[ numpy.array([1,2]) + 3*n ] = present[ numpy.array([1,2])+3*n ] + stepsize*v
        future[ numpy.array([1,2]) + 3*n ] = numpy.mod( future[ numpy.array([1,2]) + 3*n ], D)

    noise = (-eta/2 + numpy.random.rand( N, )* eta/2 ) # add noise to angle
    future[0::3] = numpy.mod( present[0::3] + neighborMeanAngles + noise, 2*numpy.pi )
    
    return(future)

Once we have everything coded up, we can think of this entire process of simulating cells as an experiment.

Inputs: $\eta$ (noise level), $D$ (box size), $R$ (neighborhood range), $N$ (number of cells), $T$ (duration), $stepsize$ (timestep).
Outputs: plot of trajectories, plot of initial state, plot of final state.

So let's write a function experiment that takes in these parameters and runs our usual simulation loop.

In [23]:

def experiment(N=30, T=1000, R=1, D=25, eta=0.1, stepsize=1): # values listed here are 'defaults': they are used if we don't specify the appropriate constant
    state0 = initializeCells(N, D=D)
    trajectory = numpy.zeros( shape=(N*3, T));
    trajectory[:,0] = state0

    for t in range(T-1):
        trajectory[:,t+1] = update_rule_5( trajectory[:,t], stepsize=stepsize, eta=eta, R=R, D=D ) # repeatedly apply update rule to columns of our trajectory matrix

    plt.figure(0)
    plottrajectories(trajectory,D=D) # trajectory plot
    plt.figure(1)
    plotcells(trajectory[:,0], D=D) # initial state
    plt.figure(2)
    plotcells(trajectory[:,-1], D=D) #final state

Now, we're ready to produce some experiments

In [24]:

experiment() # default parameters

In [25]:

experiment(N=300) # more cells

In [26]:

experiment(N=300, T=3000) # more cells and more time

I'd say that the cells look somewhat more clustered now! Now, take a look at Topaz paper and see what parameters they use, and try to recreate the pictures that they have. You won't get the exact pictures because random initial conditions will make the graphs slightly different, but at least qualitatively things should be alike.

Notice we can play with all parameters that are defined in the "signature" of the function (first line): def experiment(N=30, T=1000, R=1, D=25, eta=0.1, stepsize=1). For example, we can shrink the domain. With $R=1$ (default value), this means that cells can sense a much larger proportion of the domain than before.

In [27]:

experiment(D=5, N=300, T=1000)