The standard correlation matrix assumes data is distributed according to a one-way multivariate model. This assumption is violated in spatio-temporal measurements that come from fMRI.
Conclusion: Standard pairwise correlation, which is intended for one-way measurements, is a statistically biased estimator for uncentered two-way measurements.
The default simulation uses p=50 brain regions and m=200 BOLD fMRI volumes (measurements), with a null correlation matrix sigma.
% Simulate one-way multivariate data
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Dimensions or Features
p = 50;
% Measurements or Observations
m = 200;
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Multivariate, i.i.d normal data
rcen_mean = @(mu)(repmat(mu,[m 1]));
rcen_mvnsim = @(mu,sigma)(rcen_mean(mu) + randn(m,p)*sqrtm(sigma));
title = @(tstring)(title(tstring,'fontsize',16));
mu = zeros(1,p);
sigma = eye(p);
X = rcen_mvnsim(mu,sigma);
figure;
subplot(1,2,1);
imagesc(rcen_mean(mu)); colorbar;
title('Row-centered, Mean')
subplot(1,2,2);
Shat_centered = corr(X);
imagesc(Shat_centered); colormap(summer); colorbar; axis image equal;
title(sprintf('Sample Correlation. \n Centered mean m=%d, p=%d',m,p)); set(gca,'fontsize',18)
figure;
hist_norm = 1;
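% Collect the strict upper triangle as a vector so the zeroed lower triangle is not binned
offdiag_centered = Shat_centered(triu(true(p),1));
hist(offdiag_centered,100,hist_norm);ylim([0 hist_norm*1.1]); xlim([-.5 .5]);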
title('Histogram of off-diagonal correlations')
The default simulation now adds a random offset to each measurement. The size of the offset is controlled by rsignal; any non-zero value produces row-uncentered data.
% Simulate row uncentered two-way multivariate data
% Vary the size of the row offset by changing rsignal (set to 2 below).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
rsignal = 2; csignal = 0; rand('seed',0); randn('seed',0); % the draws below use randn, so seed it too
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% % Alternative: constant offset, identical for every observation
% row_offset = ones(m,1)*.25*sqrt(rsignal);
%
% Random offset for each observation
row_offset = rsignal + randn(m,1)*.25*sqrt(rsignal);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Random offset for each feature. Default 0.
mu = csignal + randn(1,p)*.25*sqrt(csignal);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
rncen_mean = @(mu)(repmat(mu,[m 1])+repmat(row_offset,[1 p]));
rncen_mvnsim = @(mu,sigma)(rncen_mean(mu) + randn(m,p)*sqrtm(sigma));
Xnc = rncen_mvnsim(mu,sigma);
disp('Center the features (columns or brain regions or voxels)')
Xnc = bsxfun(@minus,Xnc,mean(Xnc));
mu_c_hat = mean(Xnc);
mu_r_hat = mean(Xnc,2)'; % mean across regions for each observation, as a 1-by-m row
disp('.....')
disp('Mean signal corresponding to Regions. Region 1, ...., Region 5')
mu_c_hat(1:5)
disp('Mean "brain wide" signal at each measurement due to measurement error. Observation 1, ...., Observation 5')
mu_r_hat(1:5)
Although all population correlation coefficients are zero, the sample correlation coefficients are statistically biased because the measurements are row-uncentered. Exercise: Try increasing the number of measurements $m$.
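A short derivation suggests where the bias comes from. It is a sketch that assumes the simulation settings used here: sigma is the identity and each observation $i$ receives one random offset. The symbols $r_i$, $\epsilon_{ij}$ and $\tau^2$ are introduced only for this illustration and do not appear in the code.

$$x_{ij} = \mu_j + r_i + \epsilon_{ij}, \qquad \epsilon_{ij} \sim N(0,1), \qquad r_i \sim N(\mathrm{rsignal},\ \tau^2), \qquad \tau^2 = 0.25^2 \cdot \mathrm{rsignal}$$

$$\mathrm{Cov}(x_{ij}, x_{ik}) = \mathrm{Var}(r_i) = \tau^2 \quad (j \neq k), \qquad \mathrm{Var}(x_{ij}) = 1 + \tau^2$$

$$\mathrm{Corr}(x_{ij}, x_{ik}) = \frac{\tau^2}{1 + \tau^2}$$

With rsignal = 2 this gives $\tau^2 = 0.125$ and a correlation of $0.125/1.125 \approx 0.11$ between every pair of regions. Because the shared offset $r_i$ stays in the data however many observations are collected, the off-diagonal sample correlations concentrate around this value rather than around zero as $m$ grows.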
Xnc = rncen_mvnsim(mu,sigma);
figure;
subplot(1,2,1);
imagesc(rncen_mean(mu)); colorbar;
title('Row uncentered, Mean')
subplot(1,2,2);
Shat_ncentered = corr(Xnc);
imagesc(Shat_ncentered); colormap(summer); colorbar; axis image equal;
title(sprintf('Sample Correlation. \n Uncentered mean m=%d, p=%d',m,p)); set(gca,'fontsize',18)
figure;
hist_norm = 1;
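% Collect the strict upper triangle as a vector so the zeroed lower triangle is not binned
offdiag_ncentered = Shat_ncentered(triu(true(p),1));
hist(offdiag_ncentered,100,hist_norm);ylim([0 hist_norm*1.1]); xlim([-.5 .5]);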
title('Histogram of off-diagonal correlations')
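One possible way to work through the exercise is sketched below. It assumes the same settings as the simulation above (p regions, identity sigma, zero column means, and a random row offset scaled by rsignal), and the names m_grid, offdiag_mask, mean_offdiag and rho_limit are hypothetical, introduced only for this illustration. The reference value rho_limit is the correlation implied by the row-offset variance under these assumptions.
% Sketch: does the mean off-diagonal sample correlation shrink as m grows?
% Uses p and rsignal from the workspace; does not overwrite m, X or Xnc.
m_grid = [200 2000 20000];                 % hypothetical grid of sample sizes
offdiag_mask = triu(true(p),1);            % logical mask of the strict upper triangle
mean_offdiag = zeros(size(m_grid));
tau2 = (.25^2)*rsignal;                    % variance of the random row offset
rho_limit = tau2/(1+tau2);                 % correlation induced by the shared offset
for k = 1:numel(m_grid)
    mk = m_grid(k);
    % One random offset per observation, shared by all p regions
    row_offset_k = rsignal + randn(mk,1)*.25*sqrt(rsignal);
    % Independent unit-variance noise per region (sigma = eye(p), mu = 0)
    Xk = repmat(row_offset_k,[1 p]) + randn(mk,p);
    Sk = corr(Xk);
    mean_offdiag(k) = mean(Sk(offdiag_mask));
    fprintf('m=%6d  mean off-diagonal corr=%.3f  (reference %.3f)\n', mk, mean_offdiag(k), rho_limit);
end
If the bias were only a finite-sample effect, the printed means would move toward zero as m increases. Under the assumptions above they are instead expected to settle near the reference value.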