Multi-site imaging studies are becoming increasingly common.
Combining imaging data across sites introduces non-biological sources of variation that arise from the use of different scanner hardware and acquisition protocols.
E.g., field strength, manufacturer, subject positioning
Scanner effects or site effects are similar to batch effects in the genomics literature
Known to affect measurement of regional volumes, cortical thickness, voxel-based morphometry, …
More generally, structural, functional, diffusion tensor, and other types of images and features extracted from them may exhibit scanner effects
Need to eliminate or account for scanner effects in downstream statistical analyses
Most critical if sites or scanners are imbalanced with respect to other variables such as age, sex, race, clinical status
Simply including scanner as a confounding variable may not work well (Rao et al. 2017)
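One quick way to see whether sites are imbalanced with respect to a variable such as clinical status is to cross-tabulate the two. The sketch below uses a handful of made-up records; the site labels, diagnosis codes, and the `crosstab` helper are hypothetical and are not taken from the ADNI data.

```python
from collections import Counter

def crosstab(records):
    """Count subjects for each (site, diagnosis) pair."""
    return Counter((site, dx) for site, dx in records)

# Toy (site, diagnosis) records, invented for illustration only.
records = [
    ("site01", "control"), ("site01", "control"), ("site01", "MCI"),
    ("site02", "AD"), ("site02", "AD"), ("site02", "MCI"),
]

counts = crosstab(records)
sites = sorted({site for site, _ in records})
diagnoses = ["control", "MCI", "AD"]

# Print one row per site; a Counter returns 0 for pairs that never occur.
for site in sites:
    print(site, {dx: counts[(site, dx)] for dx in diagnoses})
```

In this toy table one site contributes only controls and MCI while the other contributes only MCI and AD, so site and diagnosis are hard to separate. That is the kind of imbalance under which scanner effects matter most.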
Several methods for estimating and removing unwanted sources of variation due to site/scanner have been adapted to neuroimaging data.
In this tutorial we will use ComBat (Johnson, Li, and Rabinovic 2007; Fortin et al. 2018) to harmonize cortical thicknesses from the ADNI data.
ComBat has been shown to effectively reduce scanner-to-scanner variability while preserving biological associations.
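To make the idea concrete, here is a minimal sketch of a location-and-scale adjustment, the kind of correction ComBat builds on, for a single feature. It is a simplification and not the ComBat algorithm itself: ComBat also estimates the site parameters with empirical Bayes, borrowing information across features, and models biological covariates so that their effects are preserved. The function name `location_scale_harmonize` and the toy thickness values are hypothetical.

```python
from statistics import mean, pstdev

def location_scale_harmonize(values, sites):
    """Shift and rescale each site's values to a common mean and spread.

    Simplified illustration: one feature, no covariates, and no
    empirical Bayes shrinkage of the per-site estimates.
    Each site needs at least two distinct values (nonzero spread).
    """
    grand_mean = mean(values)

    # Per-site mean (additive site effect) and spread (multiplicative site effect).
    by_site = {}
    for v, s in zip(values, sites):
        by_site.setdefault(s, []).append(v)
    site_mean = {s: mean(vs) for s, vs in by_site.items()}
    site_sd = {s: pstdev(vs) for s, vs in by_site.items()}

    # Pooled spread of the residuals left after removing the site means.
    residuals = [v - site_mean[s] for v, s in zip(values, sites)]
    pooled_sd = pstdev(residuals)

    # Standardize within site, then map back to the pooled scale and grand mean.
    return [
        grand_mean + pooled_sd * (v - site_mean[s]) / site_sd[s]
        for v, s in zip(values, sites)
    ]

# Toy values for one region measured at two sites (invented numbers).
values = [2.0, 2.2, 2.4, 3.0, 3.4, 3.8]
sites = ["A", "A", "A", "B", "B", "B"]
harmonized = location_scale_harmonize(values, sites)
print([round(h, 3) for h in harmonized])
```

After the adjustment both sites have the same mean and standard deviation for this feature. This is also why covariates need to be modeled in real use: without them, genuine differences between the populations scanned at each site would be removed along with the scanner effects.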
ADNI Cortical Thickness Data
The Alzheimer’s Disease Neuroimaging Initiative (ADNI) is a multi-million-dollar study funded by public and private sources, including the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, the Food and Drug Administration, private pharmaceutical companies, and non-profit organizations.
The goals of the ADNI are to better understand progression from normal aging to mild cognitive impairment (MCI) and early Alzheimer’s disease (AD) and determine effective biomarkers for disease diagnosis, monitoring, and treatment development.
We estimated cortical thicknesses from a subset of initial subject visits (N=187)
Mix of male/female aged 56-91
Mix of healthy controls (46), MCI (93), and AD (48) diagnoses at baseline
Our subset consists of images acquired from scanners at 17 different sites