Load Packages
Load Data
The data loading section is done for you, so you can focus on the other tasks.
Linear regressions
We run an OLS regression of each return series on (1, Rme, RSMB, RHML, RMOM) and report the regression coefficients and their t-statistics. The returns are those of 6 portfolios formed on Size and Book-to-Market, which can be retrieved from Kenneth R. French's website.
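The regression described above can be sketched as follows. This is a minimal illustration on simulated stand-in data, not the notebook's own solution: the names `F`, `b_true`, and `tstat` are hypothetical, and the standard errors assume homoskedastic errors.

```julia
using LinearAlgebra, Random

# Simulated stand-in data (hypothetical): T months, 4 factors, 1 portfolio return
Random.seed!(42)
T = 500
F = randn(T, 4)                       # stand-ins for Rme, RSMB, RHML, RMOM
X = [ones(T) F]                       # regressors (1, Rme, RSMB, RHML, RMOM)
b_true = [0.1, 1.0, 0.5, -0.3, 0.2]   # assumed "true" coefficients for the simulation
y = X * b_true + 0.5 * randn(T)

# OLS coefficients, residuals, and classical standard errors
b = X \ y                             # least-squares solution
u = y - X * b                         # residuals
K = size(X, 2)
s2 = dot(u, u) / (T - K)              # residual variance estimate
se = sqrt.(diag(s2 * inv(X' * X)))    # homoskedastic standard errors
tstat = b ./ se
```

With the actual data, the same steps would be repeated for each of the 6 portfolio return series.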
Task 1 (35 points)
Estimate the coefficients of a regression of the returns on (1, Rme, RSMB, RHML, RMOM) using Maximum Likelihood Estimation (MLE) and report the regression coefficients. The goal is to show that under the assumption of normally distributed errors, as is typically made in linear regression, MLE and OLS lead to identical estimates.
Hint 1:
Hint 2: optimum.minimizer contains the estimates.
Hint 3: If you run into trouble, try a different optimization method:
NelderMead()
SimulatedAnnealing()
GradientDescent()
See http://julianlsolvers.github.io/Optim.jl/latest/#algo/nelder_mead/ for further methods.
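A minimal sketch of the idea behind Task 1, assuming simulated stand-in data and a Gaussian log-likelihood parameterized by the coefficients and log(sigma). The function name `negloglik`, the starting values, and the choice of `LBFGS()` with forward-mode automatic differentiation are illustrative assumptions, not the assignment's prescribed solution.

```julia
using Optim, LinearAlgebra, Random

# Simulated stand-in data (hypothetical)
Random.seed!(1)
T = 500
X = [ones(T) randn(T, 4)]
y = X * [0.1, 1.0, 0.5, -0.3, 0.2] + 0.5 * randn(T)

# Negative Gaussian log-likelihood; theta = (b..., log(sigma))
function negloglik(theta, y, X)
    K = size(X, 2)
    b = theta[1:K]
    sigma = exp(theta[K+1])           # exp keeps sigma positive
    u = y - X * b
    n = length(y)
    return 0.5 * n * log(2π) + n * log(sigma) + sum(abs2, u) / (2 * sigma^2)
end

theta0 = zeros(size(X, 2) + 1)        # start at b = 0, sigma = 1
optimum = optimize(th -> negloglik(th, y, X), theta0, LBFGS(); autodiff = :forward)

b_mle = Optim.minimizer(optimum)[1:end-1]   # drop the log(sigma) entry
b_ols = X \ y
```

Comparing `b_mle` with `b_ols` should show agreement up to the optimizer's tolerance, since maximizing the Gaussian likelihood over the coefficients is equivalent to minimizing the sum of squared residuals.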
Task 2 (20 points)
This exercise is for ambitious students who would like to receive a high mark.
Suppose that we wish to test whether the parameter estimates are statistically different from zero. Suppose further that we do not know how to compute the standard errors of the MLE parameter estimates analytically.
We decide to (non-parametrically) bootstrap by resampling cases in order to estimate the standard errors. This means that we treat the sample of N individuals as if it were a population from which we randomly draw B samples, each of size N. This produces a sample of MLEs of size B, that is, it provides an empirical approximation to the distribution of the MLE. From the empirical approximation, we can compare the full-sample point MLE to the MLE distribution under the null hypothesis.
Bootstrap samples can be easily generated using the function sample() (provided by StatsBase.jl). Each bootstrap sample should be drawn with replacement from the original sample and should have the same number of observations.
Proceed in three steps:
1.) Make sure that the log-likelihood function takes both the independent and the dependent variables as inputs.
2.) Set up a bootstrap function for the standard errors (see Hint 1).
Hint 1:
Hint 2: If you run into errors, reduce the number of bootstrap samples to something smaller than 1,000 and then gradually increase it.
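The case-resampling bootstrap described in this task could be sketched as below. This is an illustration under assumptions, not the assignment's reference solution: the helper names `negloglik`, `fit_mle`, and `bootstrap_se` are hypothetical, the data are simulated, and `B = 200` is chosen only to keep the run short.

```julia
using Optim, StatsBase, Statistics, Random

# Step 1: negative Gaussian log-likelihood with the data passed in explicitly
function negloglik(theta, y, X)
    K = size(X, 2)
    u = y - X * theta[1:K]
    sigma = exp(theta[K+1])           # exp keeps sigma positive
    n = length(y)
    return 0.5 * n * log(2π) + n * log(sigma) + sum(abs2, u) / (2 * sigma^2)
end

# Coefficient MLE for one (re)sample
function fit_mle(y, X)
    theta0 = zeros(size(X, 2) + 1)
    res = optimize(th -> negloglik(th, y, X), theta0, LBFGS(); autodiff = :forward)
    return Optim.minimizer(res)[1:end-1]
end

# Step 2: draw row indices with replacement, refit, and collect the estimates
function bootstrap_se(y, X; B = 200, rng = Random.default_rng())
    N = length(y)
    draws = zeros(B, size(X, 2))
    for b in 1:B
        idx = sample(rng, 1:N, N; replace = true)   # same size as the original sample
        draws[b, :] = fit_mle(y[idx], X[idx, :])
    end
    return vec(std(draws; dims = 1)), draws
end

# Simulated stand-in data (hypothetical)
rng = MersenneTwister(123)
N = 300
X = [ones(N) randn(rng, N, 2)]
y = X * [0.2, 1.0, -0.5] + 0.5 * randn(rng, N)

se_boot, draws = bootstrap_se(y, X; B = 200, rng = rng)
```

Dividing the full-sample estimates by `se_boot` would then give bootstrap-based t-statistics. Rows of `X` and `y` are resampled together, so each observation keeps its own regressors.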
Task 4 (15 points)
This exercise is for ambitious students who would like to receive a high mark. Please note that you do not need to solve Task 3 in order to give an elaborate answer here.
Give a brief interpretation of your result, i.e. why bootstrapped confidence bands provide better finite-sample performance. Be brief (no more than 100 words).
Comment
Since bootstrapping creates multiple resamples with replacement from a single finite set of observations, even if the data are not exactly normally distributed, the distribution of the estimates across a large number of independent resamples approaches a normal distribution by the Central Limit Theorem, given a sufficiently large sample size.
This removes the need to assume normally distributed errors in the linear regression.
Hence, when the original sample is not normally distributed, bootstrapped confidence bands constructed from the 25th and 975th ranked values (assuming 1000 resamples) are likely to perform better than simply taking the 5th and 95th percentile values in the sample dataset for hypothesis testing.
Task 5 (30 points)
Replicating OLS by MLE in Task 1 might look boring, and one is tempted to ask why we should go through the effort of deriving a log-likelihood function when OLS works just fine. Therefore, we now turn to a more hands-on application of MLE.
Imagine that, after your degree, you start working in the risk management division of a large hedge fund in the Cayman Islands. On the first day your manager gives you the following task:
You get some data (mledata.csv) containing the monthly maximum drawdown of one of the funds (numbers are in percentages).
1.) Load the data, plot a histogram, and make an educated guess about the underlying distribution (1-2 sentences are enough).
2.) Estimate the two parameters of a gamma distribution using MLE.
3.) Based on the estimated parameters, what is the expected maximum drawdown for next month?
Hint 1: use SpecialFunctions to calculate the gamma function Γ(·).
using Pkg; Pkg.add("SpecialFunctions")
Hint 2: the gamma distribution PDF takes the form:
.
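For reference, one common way to write the gamma density is the shape-scale form below. The symbols $\alpha$ (shape) and $\theta$ (scale) are an assumption here; the assignment's own formula may use a different parameterization (for example a rate parameter $\beta = 1/\theta$).

$$
f(x;\alpha,\theta) = \frac{x^{\alpha-1} e^{-x/\theta}}{\Gamma(\alpha)\,\theta^{\alpha}}, \qquad x > 0,\ \alpha > 0,\ \theta > 0 .
$$

Under this parameterization the log-likelihood of a sample $x_1,\dots,x_n$ is

$$
\ell(\alpha,\theta) = (\alpha-1)\sum_{i=1}^{n}\log x_i - \frac{1}{\theta}\sum_{i=1}^{n} x_i - n\log\Gamma(\alpha) - n\alpha\log\theta ,
$$

and the mean of the distribution is $\mathbb{E}[x] = \alpha\theta$.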
Hint 3: The maximum drawdown in this dataset is the maximum percentage loss in each month relative to the highest value.
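A minimal sketch of the gamma MLE, assuming a shape-scale parameterization (shape `alpha`, scale `theta`) and simulated data standing in for mledata.csv. The function name `gamma_negloglik`, the log-transformed parameters, and the use of `loggamma` instead of `gamma` (for numerical stability) are illustrative choices, not the assignment's prescribed approach.

```julia
using Optim, SpecialFunctions, Statistics, Random

# Negative gamma log-likelihood; p = (log(alpha), log(theta)) keeps both parameters positive
function gamma_negloglik(p, x)
    alpha, theta = exp(p[1]), exp(p[2])
    n = length(x)
    ll = (alpha - 1) * sum(log, x) - sum(x) / theta -
         n * loggamma(alpha) - n * alpha * log(theta)
    return -ll
end

# Simulated stand-in for mledata.csv (hypothetical): the sum of 3 exponential
# draws with scale 2 follows a gamma distribution with shape 3 and scale 2
rng = MersenneTwister(7)
x = [sum(2.0 .* randexp(rng, 3)) for _ in 1:2000]

res = optimize(p -> gamma_negloglik(p, x), [0.0, 0.0], NelderMead())
alpha_hat, theta_hat = exp.(Optim.minimizer(res))

# Mean of a gamma distribution in the shape-scale form
expected_dd = alpha_hat * theta_hat
```

Under this parameterization the fitted mean `alpha_hat * theta_hat` is the expected maximum drawdown, and at the MLE it should coincide with the sample mean of the data up to optimizer tolerance.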