Image: ubuntu2004
Kernel: Python 3 (system-wide)

Mean-variance analysis with simulated data

Camilo A. Garcia Trillos - 2020

In this class

- We implement a Markowitz portfolio optimization after simulating some return data (loosely based on a version from the blog http://blog.quantopian.com).
- We learn how to perform an optimisation using the scipy package.
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy.optimize import minimize

np.random.seed(123)

Markowitz mean-variance analysis

In this class we start with simulated random data. Later we will learn how to use actual stock data.

Assume that we have 4 assets, each with a return series of length 1000. We can use numpy.random.multivariate_normal to sample returns from a (multivariate) normal distribution (recall Class A).

## NUMBER OF ASSETS
n_assets = 4

## NUMBER OF OBSERVATIONS
n_obs = 1000

## VECTOR OF MEANS
mu_bar = np.array([0., 0., 0., 0.])
#mu_bar = np.array([0.001, 0.002, 0.005, 0.008])

## MATRIX OF COVARIANCE
# With this auxSigma matrix, inverting is stable
auxSigma = np.array([[1., 0., 0., 0.], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
# With this auxSigma matrix, inverting is less stable
#auxSigma = np.array([[2, 0., 0., 0.], [-0.2, 0.7, 0, 0], [-0.3, 0.4, 0.7, 0], [0.1, 0.4, 0.4, 2.3]])

Sigma_exact = auxSigma.T @ auxSigma
print(Sigma_exact)
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
return_vec = np.random.multivariate_normal(mu_bar, Sigma_exact, n_obs)
print(return_vec.shape)
(1000, 4)
plt.plot(return_vec, alpha=0.3);
plt.xlabel('Time')
plt.ylabel('Returns')
Text(0, 0.5, 'Returns')
Image in a Jupyter notebook

Random portfolios

Let's generate a couple of random portfolios. Let's assume for the moment that we restrict ourselves to "long-only" portfolios, that is, the weights are nonnegative.

def rand_weights_pos(n):
    k = np.random.rand(n)
    return k / sum(k)

print(rand_weights_pos(n_assets))
print(rand_weights_pos(n_assets))
[0.54066805 0.2360283  0.11660484 0.1066988 ]
[0.27638339 0.03006307 0.47850085 0.21505269]

Let's next compute the expected return and standard deviation for a portfolio.
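In formulas, for a weight vector $w$ (summing to one), with $p$ the vector of sample mean returns and $C$ the sample covariance matrix of the returns, the function below computes

$$\mu = w^\top p, \qquad \sigma = \sqrt{w^\top C\, w}.$$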

def random_portfolio(returns, weights):
    if returns.shape[1] != weights.shape[0]:
        print("Error: MSG by random_portfolio: dimensions don't fit")
        return 0
    w = np.asmatrix(weights)
    p = np.asmatrix(np.mean(returns, axis=0))
    C = np.asmatrix(np.cov(returns.T))
    mu = w * p.T
    sigma = np.sqrt(w * C * w.T)
    return mu, sigma
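As an aside, the np.matrix class used above is discouraged in current NumPy in favour of plain arrays. A minimal sketch of the same computation with arrays (the name random_portfolio_arr is ours, and this version returns plain floats rather than 1x1 matrices):

def random_portfolio_arr(returns, weights):
    # same computation as random_portfolio, but with plain NumPy arrays
    w = np.asarray(weights)
    p = np.mean(returns, axis=0)      # sample mean returns
    C = np.cov(returns.T)             # sample covariance matrix
    return w @ p, np.sqrt(w @ C @ w)  # (expected return, standard deviation)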

Now, let's plot several such random risk-return profiles:

n_portfolios = 3000

means, stds = np.column_stack([
    random_portfolio(return_vec, rand_weights_pos(n_assets))
    for _ in range(n_portfolios)
])

plt.plot(stds, means, 'o', markersize=5)
plt.xlabel('std')
plt.ylabel('mean')
plt.title('Mean and standard deviation of returns of randomly generated portfolios (long-only)')
Text(0.5, 1.0, 'Mean and standard deviation of returns of randomly generated portfolios (long-only)')
Image in a Jupyter notebook
def rand_weights_unconstrained(n):
    k = np.random.rand(n)*2 - 1   # Uniformly distributed on [-1,1]
    return k / sum(k)

means_unc, stds_unc = np.column_stack([
    random_portfolio(return_vec, rand_weights_unconstrained(n_assets))
    for _ in range(2 * n_portfolios)
])

plt.plot(stds_unc, means_unc, 'og', markersize=5)
plt.xlabel('std')
plt.ylabel('mean')
plt.title('Mean and standard deviation of returns of randomly generated portfolios')
Text(0.5, 1.0, 'Mean and standard deviation of returns of randomly generated portfolios')
Image in a Jupyter notebook
plt.plot(stds_unc[stds_unc<0.9], means_unc[stds_unc<0.9], 'og', markersize=5, alpha=0.4)
plt.plot(stds, means, 'o', markersize=5, alpha=0.2)
plt.xlabel('std')
plt.ylabel('mean')
plt.title('Mean and standard deviation of returns of randomly generated portfolios')
Text(0.5, 1.0, 'Mean and standard deviation of returns of randomly generated portfolios')
Image in a Jupyter notebook

Let us now check the covariance matrix and find its inverse.

Sigma = np.cov(return_vec.T)
print(Sigma)
[[ 0.95457653  0.00415482  0.04630785 -0.03329693]
 [ 0.00415482  0.94767103  0.00220586 -0.0675899 ]
 [ 0.04630785  0.00220586  1.02460895  0.06388503]
 [-0.03329693 -0.0675899   0.06388503  0.94171693]]

Let us check the stability of the matrix via its condition number.

np.linalg.cond(Sigma)
1.26406561814308

This is not too bad: the condition number is close to 1, so we can move on without expecting major stability problems.
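As a quick sanity check (a sketch, assuming Sigma is symmetric positive definite, as a sample covariance matrix of full-rank data should be), the 2-norm condition number returned by np.linalg.cond equals the ratio of the largest to the smallest eigenvalue:

eig_vals = np.linalg.eigvalsh(Sigma)       # eigenvalues of the symmetric matrix Sigma
print(eig_vals.max() / eig_vals.min())     # should agree with np.linalg.cond(Sigma)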

Sigma_inv = np.linalg.inv(Sigma)

Markowitz, as done in the lecture notes

def Markowitz_weights(Sigma, mean_returns, mu_p):
    S_inv = np.linalg.inv(Sigma)
    pi_mu = S_inv @ mean_returns / sum(S_inv @ mean_returns)
    pi_1 = np.sum(S_inv, axis=1) / sum(np.sum(S_inv, axis=1))
    lambda_denominator = (mean_returns @ pi_mu) - (mean_returns @ pi_1)
    ll = np.array((mu_p - (mean_returns @ pi_1)) / lambda_denominator)
    # previous line: to convert into an array in case mu_p is a single number
    ll.shape = (ll.size, 1)
    return pi_mu * ll + pi_1 * (1 - ll)
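In formulas, with $\mu$ the vector of mean returns and $\mathbf{1}$ the vector of ones, the function above computes the two reference portfolios and mixes them according to the target return $\mu_p$:

$$\pi_\mu = \frac{\Sigma^{-1}\mu}{\mathbf{1}^\top \Sigma^{-1}\mu}, \qquad \pi_1 = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top \Sigma^{-1}\mathbf{1}}, \qquad \lambda = \frac{\mu_p - \mu^\top \pi_1}{\mu^\top \pi_\mu - \mu^\top \pi_1}, \qquad \pi = \lambda\,\pi_\mu + (1-\lambda)\,\pi_1.$$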

Remark: recall from the lecture that Markowitz needs at least two expected returns to differ. This is not the case for our simulated data: every portfolio has a zero (true) expected return, and the equally weighted portfolio has the smallest variance. Let's just proceed nevertheless and see how far we get ...
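To illustrate the remark (a small sketch, not part of the original notebook): with the exact parameters every asset has zero mean and the covariance is the identity, so the true minimum-variance portfolio is the equally weighted one. What makes the computation below run at all is that the sample means differ slightly from zero.

print(np.mean(return_vec, axis=0))   # sample means: close to, but not exactly, zero

S_inv_exact = np.linalg.inv(Sigma_exact)
print(np.sum(S_inv_exact, axis=1) / np.sum(S_inv_exact))   # exact minimum-variance weights: 0.25 each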

Sigma = np.cov(return_vec.T)
mean_returns = np.mean(return_vec, axis=0)
Markowitz_weights(Sigma, mean_returns, 0.05)
array([[-0.14820352, 0.83623086, -0.24774642, 0.55971907]])
mu_p = np.arange(0.02,0.06,0.002)
Markowitz_weights(Sigma, mean_returns, mu_p)
array([[ 1.77270001e-01, 3.76150124e-01, 1.22801901e-01, 3.23777975e-01], [ 1.55571766e-01, 4.06822173e-01, 9.80986790e-02, 3.39507381e-01], [ 1.33873532e-01, 4.37494223e-01, 7.33954574e-02, 3.55236788e-01], [ 1.12175297e-01, 4.68166272e-01, 4.86922358e-02, 3.70966195e-01], [ 9.04770630e-02, 4.98838322e-01, 2.39890143e-02, 3.86695601e-01], [ 6.87788285e-02, 5.29510371e-01, -7.14207263e-04, 4.02425008e-01], [ 4.70805940e-02, 5.60182420e-01, -2.54174288e-02, 4.18154415e-01], [ 2.53823595e-02, 5.90854470e-01, -5.01206504e-02, 4.33883821e-01], [ 3.68412503e-03, 6.21526519e-01, -7.48238719e-02, 4.49613228e-01], [-1.80141094e-02, 6.52198568e-01, -9.95270935e-02, 4.65342635e-01], [-3.97123439e-02, 6.82870618e-01, -1.24230315e-01, 4.81072041e-01], [-6.14105784e-02, 7.13542667e-01, -1.48933537e-01, 4.96801448e-01], [-8.31088129e-02, 7.44214716e-01, -1.73636758e-01, 5.12530855e-01], [-1.04807047e-01, 7.74886766e-01, -1.98339980e-01, 5.28260261e-01], [-1.26505282e-01, 8.05558815e-01, -2.23043201e-01, 5.43989668e-01], [-1.48203516e-01, 8.36230864e-01, -2.47746423e-01, 5.59719075e-01], [-1.69901751e-01, 8.66902914e-01, -2.72449644e-01, 5.75448481e-01], [-1.91599985e-01, 8.97574963e-01, -2.97152866e-01, 5.91177888e-01], [-2.13298220e-01, 9.28247013e-01, -3.21856087e-01, 6.06907295e-01], [-2.34996454e-01, 9.58919062e-01, -3.46559309e-01, 6.22636701e-01]])
Mw1 = _   # store the weights computed in the previous cell

Markowitz via an optimization algorithm

Next, a different approach that also allows for additional constraints (e.g., nonnegative portfolio weights).
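Concretely, for each target return $\mu_p$ the code below asks scipy.optimize.minimize (with the SLSQP method) to solve the quadratic program

$$\min_{\pi\in\mathbb{R}^n} \ \pi^\top \Sigma\, \pi \quad \text{subject to} \quad \mu^\top \pi = \mu_p, \quad \mathbf{1}^\top \pi = 1,$$

with the additional constraint $\pi \ge 0$ when nonnegative=True.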

from scipy.optimize import minimize

minimize?
def Markowitz_weights_opt(Sigma, mean_returns, mu_p, nonnegative=False, print_info=False):
    n = Sigma.shape[0]
    mu_p_vect = np.array(mu_p)
    mu_p_vect.shape = (mu_p_vect.size,)
    # previous lines: to convert into an array in case mu_p is a single number
    portfolios = []
    x0 = np.ones(n)/n
    cost_fun = lambda x: (x@Sigma)@x
    for mu_p in mu_p_vect:
        if nonnegative:
            cons = ({'type': 'eq', 'fun': lambda x: x@mean_returns - mu_p},
                    {'type': 'eq', 'fun': lambda x: x@np.ones(n) - 1},
                    {'type': 'ineq', 'fun': lambda x: x})
        else:
            cons = ({'type': 'eq', 'fun': lambda x: x@mean_returns - mu_p},
                    {'type': 'eq', 'fun': lambda x: x@np.ones(n) - 1})
        # Calculate efficient frontier weights using quadratic programming
        res = minimize(cost_fun, x0, method='SLSQP', constraints=cons)
        if print_info:
            print(mu_p, res)
        portfolios.append(res.x)
    # Turning the portfolios into a numpy array
    portfolios = np.array([np.array(p).squeeze() for p in portfolios])
    return portfolios
Markowitz_weights_opt(Sigma, mean_returns, 0.012)
array([[0.26426984, 0.2540042 , 0.22165789, 0.26006807]])
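The nonnegative flag has not been exercised above; a minimal usage sketch (the call below is hypothetical and its output is not part of the original notebook):

# Same target return, but restricting the optimizer to long-only portfolios
Markowitz_weights_opt(Sigma, mean_returns, 0.05, nonnegative=True)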

In the following, yet another implementation. Now we use the constraint $\mu^T \pi \geq \mu_p$ instead of $\mu^T \pi = \mu_p$.

def Markowitz_weights_optIneq(Sigma, mean_returns, mu_p_vect, nonnegative=False, print_info=False):
    n = Sigma.shape[0]
    mu_p_vect = np.array(mu_p_vect)
    mu_p_vect.shape = (mu_p_vect.size,)
    # previous lines: to convert into an array in case mu_p_vect is a single number
    portfolios = []
    x0 = np.ones(n)/n
    cost_fun = lambda x: (x@Sigma)@x
    for mu_p in mu_p_vect:
        if nonnegative:
            cons = ({'type': 'ineq', 'fun': lambda x: x@mean_returns - mu_p},
                    {'type': 'eq', 'fun': lambda x: x@np.ones(n) - 1},
                    {'type': 'ineq', 'fun': lambda x: x})
        else:
            cons = ({'type': 'ineq', 'fun': lambda x: x@mean_returns - mu_p},
                    {'type': 'eq', 'fun': lambda x: x@np.ones(n) - 1})
        # Calculate efficient frontier weights using quadratic programming
        res = minimize(cost_fun, x0, method='SLSQP', constraints=cons)
        if print_info:
            print(mu_p, res)
        portfolios.append(res.x)
    # Turning the portfolios into a numpy array
    portfolios = np.array([np.array(p).squeeze() for p in portfolios])
    return portfolios
Markowitz_weights_opt(Sigma, mean_returns, 0.02)
array([[0.17722274, 0.3761428 , 0.1228422 , 0.32379226]])
Markowitz_weights_optIneq(Sigma, mean_returns, 0.02)
array([[0.17722274, 0.3761428 , 0.1228422 , 0.32379226]])
Markowitz_weights_opt(Sigma, mean_returns, mu_p)
array([[ 1.77222742e-01, 3.76142797e-01, 1.22842200e-01, 3.23792261e-01], [ 1.55524588e-01, 4.06812914e-01, 9.81380733e-02, 3.39524425e-01], [ 1.33827210e-01, 4.37483227e-01, 7.34333174e-02, 3.55256246e-01], [ 1.12175308e-01, 4.68166247e-01, 4.86922151e-02, 3.70966229e-01], [ 9.04770623e-02, 4.98838321e-01, 2.39890150e-02, 3.86695601e-01], [ 6.87788279e-02, 5.29510372e-01, -7.14205968e-04, 4.02425006e-01], [ 4.70805922e-02, 5.60182422e-01, -2.54174260e-02, 4.18154412e-01], [ 2.53823589e-02, 5.90854469e-01, -5.01206496e-02, 4.33883822e-01], [ 3.68412398e-03, 6.21526518e-01, -7.48238709e-02, 4.49613229e-01], [-1.80141092e-02, 6.52198568e-01, -9.95270934e-02, 4.65342634e-01], [-3.97122228e-02, 6.82870473e-01, -1.24230489e-01, 4.81072239e-01], [-6.14104911e-02, 7.13542570e-01, -1.48933659e-01, 4.96801580e-01], [-8.31086328e-02, 7.44214538e-01, -1.73637001e-01, 5.12531096e-01], [-1.04807047e-01, 7.74886766e-01, -1.98339980e-01, 5.28260260e-01], [-1.26505153e-01, 8.05558699e-01, -2.23043370e-01, 5.43989824e-01], [-1.48203105e-01, 8.36230507e-01, -2.47746956e-01, 5.59719554e-01], [-1.69901625e-01, 8.66902806e-01, -2.72449807e-01, 5.75448626e-01], [-1.91599983e-01, 8.97574964e-01, -2.97152867e-01, 5.91177886e-01], [-2.13298221e-01, 9.28247011e-01, -3.21856086e-01, 6.06907296e-01], [-2.34996454e-01, 9.58919063e-01, -3.46559308e-01, 6.22636699e-01]])
Markowitz_weights_optIneq(Sigma, mean_returns, mu_p)
array([[ 1.77222742e-01, 3.76142798e-01, 1.22842200e-01, 3.23792260e-01], [ 1.55524588e-01, 4.06812912e-01, 9.81380717e-02, 3.39524429e-01], [ 1.33827210e-01, 4.37483229e-01, 7.34333178e-02, 3.55256243e-01], [ 1.12175298e-01, 4.68166272e-01, 4.86922354e-02, 3.70966194e-01], [ 9.04770710e-02, 4.98838306e-01, 2.39889999e-02, 3.86695624e-01], [ 6.87788286e-02, 5.29510371e-01, -7.14207308e-04, 4.02425008e-01], [ 4.70805965e-02, 5.60182420e-01, -2.54174313e-02, 4.18154415e-01], [ 2.53823837e-02, 5.90854436e-01, -5.01206874e-02, 4.33883868e-01], [ 3.68412479e-03, 6.21526519e-01, -7.48238718e-02, 4.49613228e-01], [-1.80141097e-02, 6.52198568e-01, -9.95270930e-02, 4.65342635e-01], [-3.97123351e-02, 6.82870607e-01, -1.24230328e-01, 4.81072056e-01], [-6.14104338e-02, 7.13542510e-01, -1.48933738e-01, 4.96801661e-01], [-8.31088149e-02, 7.44214716e-01, -1.73636756e-01, 5.12530855e-01], [-1.04807051e-01, 7.74886767e-01, -1.98339976e-01, 5.28260260e-01], [-1.26505152e-01, 8.05558698e-01, -2.23043371e-01, 5.43989826e-01], [-1.48203309e-01, 8.36230686e-01, -2.47746691e-01, 5.59719314e-01], [-1.69901753e-01, 8.66902912e-01, -2.72449642e-01, 5.75448483e-01], [-1.91599984e-01, 8.97574964e-01, -2.97152866e-01, 5.91177885e-01], [-2.13298221e-01, 9.28247012e-01, -3.21856086e-01, 6.06907295e-01], [-2.34996458e-01, 9.58919063e-01, -3.46559305e-01, 6.22636699e-01]])
Mw1
array([[ 1.77270001e-01, 3.76150124e-01, 1.22801901e-01, 3.23777975e-01], [ 1.55571766e-01, 4.06822173e-01, 9.80986790e-02, 3.39507381e-01], [ 1.33873532e-01, 4.37494223e-01, 7.33954574e-02, 3.55236788e-01], [ 1.12175297e-01, 4.68166272e-01, 4.86922358e-02, 3.70966195e-01], [ 9.04770630e-02, 4.98838322e-01, 2.39890143e-02, 3.86695601e-01], [ 6.87788285e-02, 5.29510371e-01, -7.14207263e-04, 4.02425008e-01], [ 4.70805940e-02, 5.60182420e-01, -2.54174288e-02, 4.18154415e-01], [ 2.53823595e-02, 5.90854470e-01, -5.01206504e-02, 4.33883821e-01], [ 3.68412503e-03, 6.21526519e-01, -7.48238719e-02, 4.49613228e-01], [-1.80141094e-02, 6.52198568e-01, -9.95270935e-02, 4.65342635e-01], [-3.97123439e-02, 6.82870618e-01, -1.24230315e-01, 4.81072041e-01], [-6.14105784e-02, 7.13542667e-01, -1.48933537e-01, 4.96801448e-01], [-8.31088129e-02, 7.44214716e-01, -1.73636758e-01, 5.12530855e-01], [-1.04807047e-01, 7.74886766e-01, -1.98339980e-01, 5.28260261e-01], [-1.26505282e-01, 8.05558815e-01, -2.23043201e-01, 5.43989668e-01], [-1.48203516e-01, 8.36230864e-01, -2.47746423e-01, 5.59719075e-01], [-1.69901751e-01, 8.66902914e-01, -2.72449644e-01, 5.75448481e-01], [-1.91599985e-01, 8.97574963e-01, -2.97152866e-01, 5.91177888e-01], [-2.13298220e-01, 9.28247013e-01, -3.21856087e-01, 6.06907295e-01], [-2.34996454e-01, 9.58919062e-01, -3.46559309e-01, 6.22636701e-01]])

All three methods therefore yield more or less the same portfolio, which is nice ... (Markowitz_weights_optIneq only has an inequality constraint and yields results slightly further away from the other two approaches).

(If we run the problem again with different means and variances, the solution with the inequality constraint drifts further away. This is simply because those means cannot be attained using only positive portfolios.)
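A quick way to quantify "more or less the same" (a sketch, assuming Mw1, Sigma, mean_returns and mu_p are still in memory) is the largest absolute difference in weights across all target returns:

w_eq   = Markowitz_weights_opt(Sigma, mean_returns, mu_p)
w_ineq = Markowitz_weights_optIneq(Sigma, mean_returns, mu_p)
print(np.max(np.abs(Mw1 - w_eq)), np.max(np.abs(Mw1 - w_ineq)))   # both should be small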

portfolio_mu = [mean_returns @ x for x in Mw1]
portfolio_mu
[0.02, 0.022, 0.023999999999999997, 0.025999999999999995, 0.02799999999999999, 0.029999999999999992, 0.03199999999999999, 0.033999999999999996, 0.03599999999999999, 0.03799999999999999, 0.03999999999999998, 0.04199999999999998, 0.043999999999999984, 0.04599999999999998, 0.04799999999999999, 0.049999999999999975, 0.05199999999999998, 0.05399999999999999, 0.05599999999999998, 0.05799999999999997]
portfolio_sigma = [np.sqrt((x@Sigma)@x) for x in Mw1]
portfolio_sigma
[0.515581632416369, 0.5315419321289605, 0.5509491496500153, 0.5734534325916862, 0.5987056431709378, 0.6263735219026193, 0.6561515561812542, 0.6877657118379038, 0.7209744919504029, 0.755567664520399, 0.7913637009977808, 0.8282066486133915, 0.8659628931544197, 0.9045180737805882, 0.9437742808709849, 0.9836475866049637, 1.0240659109629717, 1.064967201054332, 1.106297890538078, 1.1480116026673817]
plt.plot(stds, means, 'o', markersize=3)
plt.plot(stds_unc[stds_unc<1], means_unc[stds_unc<1], 'og', markersize=3, alpha=0.4)
plt.ylabel('mean')
plt.xlabel('std')
plt.plot(portfolio_sigma, portfolio_mu, 'ro')
plt.title('Random portfolios and mean-variance frontier')
Text(0.5, 1.0, 'Random portfolios and mean-variance frontier')
Image in a Jupyter notebook