Tutorial 2, Part 3 - Site Effects in ABIDE

In [1]:
%Add path
run('setup.m');
datadir = fullfile(basedir,'data');
filename = 'ABIDE_controlData_7Sites';
load(fullfile(datadir,filename));
load(fullfile(datadir,'cc200Map2Communities'))
max_recursion_depth(1024)

Exercise - Does global signal induced nuisance covariance matrix show site effects?

  1. Estimate nuisance covariance matrix using mean ROI signal as nuisance covariate
  2. Apply site effect statistic on nuisance matrices
In [2]:
disp('Fields in dataset')
fieldnames(data{1})

disp('signals contains subjects x brain regions x time-series')
disp(size(data{1}.signals))

namefield = 'dataName';
results = {};
for dataset=[1:7]
    disp(data{dataset}.(namefield))
    nsubjects = size(data{dataset}.signals,1);
    denoised_corr = []; 
    nuisance_corr = []; 
    nsr = [];
    for subjectno=1:nsubjects
        X = squeeze(data{dataset}.signals(subjectno,:,:))';
        Y = mean(X,2);
        [Sigma output] = covariance.sample_conditional_correlation(X, ...
                                    struct('verbose',false), ...	
                                    'nuisance',Y);
        denoised_corr(subjectno,:,:) = Sigma;
        nuisance_corr(subjectno,:,:) = output.nCov; 
        nsr(subjectno) = output.NSR;
        if(mod(subjectno,10)==0)
            disp(['Subject ...' num2str(subjectno)]);
            disp('Nuisance to Signal Ratio')
            disp(real(output.NSR))
        end
    end
    results{dataset}.denoised_corr = denoised_corr; 
    results{dataset}.nuisance_corr = nuisance_corr;
    results{dataset}.nsr = nsr;
end
Fields in dataset
ans = 
{
  [1,1] = signals
  [2,1] = subIDs
  [3,1] = age
  [4,1] = gender
  [5,1] = dataName
}
signals contains subjects x brain regions x time-series
    15   197   246
sitedata_LEUVEN_1
Subject ...10
Nuisance to Signal Ratio
 0.36805
sitedata_LEUVEN_2
Subject ...10
Nuisance to Signal Ratio
 0.19689
sitedata_NYU
Subject ...10
Nuisance to Signal Ratio
 0.21524
Subject ...20
Nuisance to Signal Ratio
 0.22247
Subject ...30
Nuisance to Signal Ratio
 0.30775
Subject ...40
Nuisance to Signal Ratio
 0.48545
Subject ...50
Nuisance to Signal Ratio
 0.28975
Subject ...60
Nuisance to Signal Ratio
 0.27000
Subject ...70
Nuisance to Signal Ratio
 0.24597
Subject ...80
Nuisance to Signal Ratio
 0.18755
Subject ...90
Nuisance to Signal Ratio
 0.23933
sitedata_UCLA_1
Subject ...10
Nuisance to Signal Ratio
 0.37074
Subject ...20
Nuisance to Signal Ratio
 0.26881
sitedata_UM_1
Subject ...10
Nuisance to Signal Ratio
 0.24623
Subject ...20
Nuisance to Signal Ratio
 0.28497
Subject ...30
Nuisance to Signal Ratio
 0.38094
Subject ...40
Nuisance to Signal Ratio
 0.38663
sitedata_USM
Subject ...10
Nuisance to Signal Ratio
 0.26280
Subject ...20
Nuisance to Signal Ratio
 0.24663
sitedata_YALE
Subject ...10
Nuisance to Signal Ratio
 0.23785
Subject ...20
Nuisance to Signal Ratio
 0.16400
In [3]:
concat_corr = @(corr1,corr2)(real(cat(1,corr1,corr2)));

sitecluster = []; 
dataset = [1:5];
currtotal = 0;
denoised_corr = [];
nuisance_corr = [];
nsr = [];
for ii=1:length(dataset)
    switch dataset(ii)
      case 1
      clusterval = 1;
      case 2
      clusterval = 1;
      case 3
      clusterval = 3;
      case 4
      clusterval = 4;
      case 5
      clusterval = 5;
      case 6
      clusterval = 6;
      case 7
      clusterval = 7;
    end
    sitecluster(currtotal+1:currtotal+size(data{dataset(ii)}.signals,1)) = clusterval;
    currtotal = currtotal+size(data{dataset(ii)}.signals,1);
  
    denoised_corr = concat_corr(denoised_corr,results{dataset(ii)}.('denoised_corr'));
    nuisance_corr = concat_corr(nuisance_corr,results{dataset(ii)}.('nuisance_corr'));
    nsr = concat_corr(nsr,results{dataset(ii)}.('nsr')'); 
end
  
reshape_corr = @(corr)(reshape(corr,[size(corr,1) size(corr,2)*size(corr,3)]));
denoised_corr = reshape_corr(denoised_corr);
nuisance_corr = reshape_corr(nuisance_corr);
figure;
subplot(1,2,1); imagesc(corr(denoised_corr')); colorbar; axis equal image;
subplot(1,2,2); imagesc(corr(nuisance_corr')); colorbar; axis equal image;


% Apply site detection statistic
[hom_com sep_com mw_com] = compareWithinAndBetweenGroupsSim(corr((denoised_corr+nuisance_corr)'), sitecluster)
communityAgrmntScore=mw_com.stats.zval;

[hom_com sep_com mw_com] = compareWithinAndBetweenGroupsSim(corr(denoised_corr'), sitecluster)
communityAgrmntScore=mw_com.stats.zval;

% Apply site detection statistic
[hom_com sep_com mw_com] = compareWithinAndBetweenGroupsSim(corr(nuisance_corr'), sitecluster)
communityAgrmntScore=mw_com.stats.zval;
MANN-WHITNEY-WILCOXON TEST
---------------------------------------------------------------------------
				Group 1		Group 2
numerosity			14447		6874
Sum of Ranks (W)		145565242.0		81737939.0
Mean rank			10075.8		11890.9
Test variable (U)		58108564.0		41200114.0
---------------------------------------------------------------------------
Sample size is large enough to use the normal distribution approximation
Mean					49654339.0
Standard deviation			420065.4346
Z corrected for continuity	20.1260		20.1260
p-value (1-tailed)			0.00000
p-value (2-tailed)			0.00000
---------------------------------------------------------------------------
 
hom_com =  0.39010
sep_com =  0.36553
mw_com =

  scalar structure containing the fields:

    stats =

      scalar structure containing the fields:

        zval =  20.126
        ranksum =  145565242

    p = 0
    h =  1

MANN-WHITNEY-WILCOXON TEST
---------------------------------------------------------------------------
				Group 1		Group 2
numerosity			14447		6874
Sum of Ranks (W)		147030614.0		80272567.0
Mean rank			10177.2		11677.7
Test variable (U)		56643192.0		42665486.0
---------------------------------------------------------------------------
Sample size is large enough to use the normal distribution approximation
Mean					49654339.0
Standard deviation			420065.4346
Z corrected for continuity	16.6375		16.6375
p-value (1-tailed)			0.00000
p-value (2-tailed)			0.00000
---------------------------------------------------------------------------
 
hom_com =  0.43814
sep_com =  0.41676
mw_com =

  scalar structure containing the fields:

    stats =

      scalar structure containing the fields:

        zval =  16.638
        ranksum =  147030614

    p = 0
    h =  1

MANN-WHITNEY-WILCOXON TEST
---------------------------------------------------------------------------
				Group 1		Group 2
numerosity			14447		6874
Sum of Ranks (W)		147870802.0		79432379.0
Mean rank			10235.4		11555.5
Test variable (U)		55803004.0		43505674.0
---------------------------------------------------------------------------
Sample size is large enough to use the normal distribution approximation
Mean					49654339.0
Standard deviation			420065.4346
Z corrected for continuity	14.6374		14.6374
p-value (1-tailed)			0.00000
p-value (2-tailed)			0.00000
---------------------------------------------------------------------------
 
hom_com =  0.23702
sep_com =  0.20667
mw_com =

  scalar structure containing the fields:

    stats =

      scalar structure containing the fields:

        zval =  14.637
        ranksum =  147870802

    p = 0
    h =  1

Out[3]:
In [4]:
% Plot NSR by Site
figure; 
plot(sitecluster,nsr,'o'); 
xlim([-.5 length(data)+1]); 
xlabel('Site'); 
ylabel('Nuisance to Signal Ratio')
Out[4]:

Reduction in site effects using conditional correlation

Nuisance covariates used are principal components from shared within-site variation

In [7]:
results = analyze_abide_siteeffects('within_site')
ans = 0
Site sitedata_LEUVEN_1, Processing Subject 5 ...
Site sitedata_LEUVEN_1, Processing Subject 10 ...
Site sitedata_LEUVEN_1, Processing Subject 15 ...
Site sitedata_LEUVEN_2, Processing Subject 5 ...
Site sitedata_LEUVEN_2, Processing Subject 10 ...
Site sitedata_LEUVEN_2, Processing Subject 15 ...
Site sitedata_UCLA_1, Processing Subject 5 ...
Site sitedata_UCLA_1, Processing Subject 10 ...
Site sitedata_UCLA_1, Processing Subject 15 ...
Site sitedata_UCLA_1, Processing Subject 20 ...
Site sitedata_UCLA_1, Processing Subject 25 ...
Site sitedata_USM, Processing Subject 5 ...
Site sitedata_USM, Processing Subject 10 ...
Site sitedata_USM, Processing Subject 15 ...
Site sitedata_USM, Processing Subject 20 ...
Site sitedata_YALE, Processing Subject 5 ...
Site sitedata_YALE, Processing Subject 10 ...
Site sitedata_YALE, Processing Subject 15 ...
Site sitedata_YALE, Processing Subject 20 ...
Site sitedata_YALE, Processing Subject 25 ...
MANN-WHITNEY-WILCOXON TEST
---------------------------------------------------------------------------
				Group 1		Group 2
numerosity			4523		1148
Sum of Ranks (W)		12293354.0		3789602.0
Mean rank			2718.0		3301.0
Test variable (U)		3130076.0		2062328.0
---------------------------------------------------------------------------
Sample size is large enough to use the normal distribution approximation
Mean					2596202.0
Standard deviation			49540.6529
Z corrected for continuity	10.7765		10.7765
p-value (1-tailed)			0.00000
p-value (2-tailed)			0.00000
---------------------------------------------------------------------------
 
MANN-WHITNEY-WILCOXON TEST
---------------------------------------------------------------------------
				Group 1		Group 2
numerosity			4523		1148
Sum of Ranks (W)		12265868.0		3817088.0
Mean rank			2711.9		3325.0
Test variable (U)		3157562.0		2034842.0
---------------------------------------------------------------------------
Sample size is large enough to use the normal distribution approximation
Mean					2596202.0
Standard deviation			49540.6529
Z corrected for continuity	11.3313		11.3313
p-value (1-tailed)			0.00000
p-value (2-tailed)			0.00000
---------------------------------------------------------------------------
 
MANN-WHITNEY-WILCOXON TEST
---------------------------------------------------------------------------
				Group 1		Group 2
numerosity			4523		1148
Sum of Ranks (W)		12486072.0		3596884.0
Mean rank			2760.6		3133.2
Test variable (U)		2937358.0		2255046.0
---------------------------------------------------------------------------
Sample size is large enough to use the normal distribution approximation
Mean					2596202.0
Standard deviation			49540.6529
Z corrected for continuity	6.8864		6.8864
p-value (1-tailed)			0.00000
p-value (2-tailed)			0.00000
---------------------------------------------------------------------------
 
GPL Ghostscript 9.16: **** Could not open the file 'tmp/22-Jun-2017-2208/analyze_abide_siteeffects1.png'.
GPL Ghostscript 9.16: Unrecoverable error, exit code 1
results =

  1x7 struct array containing the fields:

    SigXX
    SigXY
    SigYY
    nCov
    Sigma
    SigmaCov
    nCorr
    NSR
    SNR
    Y
    X_perpY
    denoised
    nuisance
    nsr

Out[7]:

Exercise: In some sites the nuisance variation in global signal sometimes contains desired BOLD signal. Find which one?

Modify line 33 to change the set of sites used in the analysis and then re-run results = analyze_abide_siteeffects('within_site') You will find that one of the sites is very different from the other sites. Why?