Comparison of CSF markers and semi-quantitative amyloid PET in Alzheimer’s disease diagnosis and in cognitive impairment prognosis using the ADNI-2 database

Background: The relative performance of semi-quantitative amyloid positron emission tomography (PET) and cerebrospinal fluid (CSF) markers in diagnosing Alzheimer’s disease (AD) and predicting the cognitive evolution of patients with mild cognitive impairment (MCI) is still debated.
Methods: Subjects from the Alzheimer’s Disease Neuroimaging Initiative 2 with complete baseline cognitive assessment (Mini Mental State Examination, Clinical Dementia Rating [CDR] and Alzheimer’s Disease Assessment Scale–Cognitive Subscale [ADAS-cog] scores), CSF collection (amyloid-β1–42 [Aβ], tau and phosphorylated tau) and 18F-florbetapir scans were included in our cross-sectional cohort. Among these, patients with MCI or substantial memory complaints constituted our longitudinal cohort and were followed for 30 ± 16 months. PET amyloid deposition was quantified using relative retention indices (standardised uptake value ratio [SUVr]) with respect to pontine, cerebellar and composite reference regions. Diagnostic and prognostic performance based on PET and CSF was evaluated using ROC analysis, multivariate linear regression and survival analysis with the Cox proportional hazards model.
Results: The cross-sectional study included 677 participants and revealed that pontine and composite SUVr values were better classifiers (AUC 0.88, diagnostic accuracy 85%) than CSF markers (AUC 0.83 and 0.85, accuracy 80% and 75%, for Aβ and tau, respectively). SUVr was a strong independent determinant of cognition in multivariate regression, whereas Aβ was not; tau was also a determinant, but to a lesser degree. Among the 396 patients from the longitudinal study, 82 (21%) converted to AD within 22 ± 13 months. Optimal SUVr thresholds to differentiate AD converters were quite similar to those of the cross-sectional study. Composite SUVr was the best AD classifier (AUC 0.86, sensitivity 88%, specificity 81%). In multivariate regression, baseline cognition (CDR and ADAS-cog) was the main predictor of subsequent cognitive decline. Pontine and composite SUVr were moderate but independent predictors of final status and CDR/ADAS-cog progression rate, whereas baseline CSF markers had a marginal influence. The adjusted HRs for AD conversion were 3.8 (p = 0.01) for PET profile, 1.2 (p = ns) for Aβ profile and 1.8 (p = 0.03) for tau profile.
Conclusions: Semi-quantitative amyloid PET appears more powerful than CSF markers for AD grading and MCI prognosis in terms of cognitive decline and AD conversion.


Background
Mild cognitive impairment (MCI) refers to cognitive deficits that do not directly impact the activities of daily living [1] and may be related to varied aetiologies, including depression, dementia and cerebrovascular disease. Only a small proportion of patients with MCI will convert to Alzheimer's disease (AD) within a given period of time, whereas the others will incur a variable cognitive decline or even revert to normal [2]. Considerable effort has been devoted to identifying and developing reliable biomarkers of incipient AD to target the individuals who would most benefit from early treatment intervention [3]. Decreased cerebrospinal fluid (CSF) concentration of the amyloid-β 1-42 peptide (Aβ) and an increased level of the protein tau are seen in patients with AD [4,5]. This pathological CSF signature is a key feature in AD diagnosis, and the CSF profile, potentially combined with neuroimaging data [6][7][8][9], has the ability to predict cognitive decline and conversion to AD independently of established risk factors such as age, sex and apolipoprotein E (ApoE) genotype [10][11][12][13].
Positron emission tomography (PET) using 11C-labelled Pittsburgh Compound B (PiB) or fluorinated tracers such as 18F-florbetapir allows in vivo visualisation and quantification of cortical Aβ deposition with high sensitivity and specificity compared with amyloid plaque burden at autopsy [14,15]. Therefore, amyloid PET was included as a pathophysiological marker in the most recent international working group diagnostic criteria [16]. Although standard interpretation relies on visual assessment, semi-quantitative measures of cortical retention with respect to a subcortical reference region are expected to provide refined evaluation of the amyloid burden with high test-retest reliability [17]. Historically, normalisation of standardised uptake value (SUV) has been done using the brainstem, pons or whole cerebellum as the reference region. However, there is growing evidence that composite reference regions that include some subcortical white matter induce less temporal variability in sequential measurements, yielding higher accuracy in assessing subtle time changes and greater power to detect Aβ accumulation [18][19][20]. Researchers in several studies have reported the capacity of amyloid PET using fluorinated tracers (either visual [21], semi-quantitative [22,23] or both [24]) to provide prognostic insight regarding cognitive decline and conversion to AD in patients with MCI, in line with previous evidence of the prognostic value of PiB PET [25][26][27][28][29]. A recent multi-centre study demonstrated the clinical impact of florbetapir PET in terms of diagnostic confidence and drug treatment [30].
Although CSF and PET measures of Aβ deposition are highly correlated [31][32][33][34], the comparative relevance of these two markers in discriminating patients with AD and predicting cognitive outcome in patients with MCI is still under debate [35]. Hake et al. showed that CSF and PET profiles were both discriminant in classifying healthy control subjects and patients with MCI vs patients with AD [36]. Palmqvist et al. found that the PET standardised uptake value ratio (SUVr) was associated with disease stage (cognition, memory and hippocampal volume) in patients with MCI, whereas CSF markers were not [37]. In recent studies, researchers concluded that CSF analysis might detect Aβ deposition earlier than PET [38] and that reduced CSF Aβ might relate more to early-stage AD, whereas the amyloid load assessed by PET is indicative of disease progression [39].
Schreiber et al. demonstrated that baseline florbetapir PET, rated either visually or using a cerebellar SUVr, was predictive of conversion to AD in a large longitudinal cohort [24]. The prognostic value of the baseline PET profile with respect to subsequent cognitive evolution was also highlighted, consistent with prior results derived from a retrospective study [22]. Yet, the exact added diagnostic and prognostic value of amyloid PET semi-quantitative indices compared with CSF markers is still unclear, and, relatedly, the optimal reference region for SUVr computation remains to be defined. In the present study, we systematically compared baseline CSF markers and PET semi-quantitative indices in terms of diagnostic value regarding baseline cognitive status, as well as prognostic value in patients with MCI regarding cognitive decline and conversion to AD. In addition, we evaluated the performance of the SUVr computed using various well-established subcortical reference regions.

Subjects
In this study, we used participant data from the Alzheimer's Disease Neuroimaging Initiative (ADNI), a multicentre project with approximately 50 medical centres and university sites across the United States and Canada [40]. The ADNI was launched in 2003 as a public-private partnership led by Principal Investigator Michael W. Weiner, MD. Its primary goal was to examine how brain imaging and other biomarkers can be used to measure the progression of MCI and early AD. Determination of sensitive and specific markers of very early AD progression is expected to help researchers and clinicians develop new treatments and monitor their effectiveness, as well as lessen the time and cost of clinical trials. A detailed description of the inclusion criteria can be found on the ADNI webpage (http://www.adni-info.org). Subjects were between 55 and 90 years old and willing and able to undergo all test procedures, including neuroimaging, and had agreed to undergo longitudinal follow-up.
Cognitively normal participants were the control subjects in the ADNI study. They showed no signs of depression, MCI or dementia. Participants with significant memory complaint (SMC) scored within the normal range for cognition but indicated concerns and exhibited slight forgetfulness. Early and late MCI participants reported an SMC either autonomously or via an informant or clinician. However, other cognitive domains showed no significant impairment, activities of daily living were preserved, and there were no signs of dementia. Participants with AD met the National Institute of Neurological and Communicative Disorders and Stroke/Alzheimer's Disease and Related Disorders Association criteria for probable AD [41,42].
Data were downloaded from the ADNI database (adni.loni.usc.edu) and included all subjects recruited in the ADNI-2 with complete available baseline data regarding cognitive assessment, CSF markers and PET Aβ quantitation. Our cross-sectional sample was made up of 677 subjects (157 control subjects, 95 with SMC, 301 with MCI among whom 153 had early MCI and 148 had late MCI, and 124 with AD at the time of the florbetapir scan; see Table 1) who were recruited between January 2011 and September 2013, and each had a baseline CSF collection and florbetapir session. The time delay between the lumbar puncture and the florbetapir PET was 11 ± 18 days. Our longitudinal sample was made up of the 396 subjects with SMC and MCI from the cross-sectional sample who had undergone an average clinical follow-up of 30 ± 16 months (see Table 2). Baseline visit and follow-up visits at 3, 6 and 12 months, then yearly, included complete cognitive assessment using the Geriatric Depression Scale, Mini Mental State Examination (MMSE), Clinical Dementia Rating (CDR) and Alzheimer's Disease Assessment Scale-Cognitive Subscale (ADAS-cog). Diagnostic status and cognitive scores were extracted from the latest available dataset ('DXSUM_PDXCONV_ADNIALL.csv'). For each participant in the longitudinal cohort, the mean annual change in cognitive scores was computed by taking the difference between the last cognitive evaluation and the baseline one and dividing by the time elapsed between the two. The last known diagnostic status was the one mentioned at the time of the last visit listed in the dataset. For each participant of the longitudinal cohort for whom the last status was AD, time to conversion was computed as the delay between the baseline visit and the first visit mentioning an AD status.
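The two derived longitudinal variables described above (mean annual change in a cognitive score, and time to conversion) can be sketched as follows. This is only an illustrative sketch: the function names, the tuple-based visit format and the day-to-year and day-to-month conversion factors are assumptions, not the processing actually applied to the ADNI files.

```python
from datetime import date

DAYS_PER_YEAR = 365.25      # assumed conversion factor
DAYS_PER_MONTH = 30.4375    # assumed conversion factor (365.25 / 12)

def annual_change(visits):
    """Mean annual change in a score.

    visits: list of (visit_date, score) tuples (hypothetical format).
    Returns (last score - baseline score) / elapsed years.
    """
    ordered = sorted(visits)
    (t0, s0), (t1, s1) = ordered[0], ordered[-1]
    years = (t1 - t0).days / DAYS_PER_YEAR
    if years <= 0:
        raise ValueError("need at least two visits on different dates")
    return (s1 - s0) / years

def months_to_conversion(visits):
    """Delay in months between baseline and the first visit labelled 'AD'.

    visits: list of (visit_date, diagnosis) tuples (hypothetical format).
    Returns None when no visit mentions an AD status.
    """
    ordered = sorted(visits)
    baseline = ordered[0][0]
    for visit_date, diagnosis in ordered:
        if diagnosis == "AD":
            return (visit_date - baseline).days / DAYS_PER_MONTH
    return None

# Illustrative values only (not ADNI data).
adas = [(date(2012, 1, 1), 10.0), (date(2014, 1, 1), 16.0)]
dx = [(date(2012, 1, 1), "MCI"), (date(2013, 1, 1), "MCI"), (date(2014, 1, 1), "AD")]
print(round(annual_change(adas), 2), round(months_to_conversion(dx), 1))
```

Participants who never receive an AD status yield `None` here, which matches their treatment as censored observations in the survival analysis.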

CSF markers
Baseline Aβ1-42, total tau and phosphorylated tau 181 (p-tau) were measured using the multiplex xMAP Luminex platform (Luminex Corp., Austin, TX, USA) with the INNO-BIA AlzBio3 kit (Innogenetics, Ghent, Belgium) [5,43]. For this study, we used the archived dataset 'UPENNBIOMK_MASTER.csv'. When multiple baseline CSF measurements were available, the median value was retained for subsequent analyses. The CSF variables studied were Aβ, tau, p-tau and the p-tau/Aβ ratio. Additional analysis details and quality control procedures appear on the ADNI website.
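As a minimal sketch of the rule just described (median of repeated baseline measurements, then the p-tau/Aβ ratio), assuming each marker is supplied as a plain list of values for one subject; the function name and the input format are hypothetical.

```python
from statistics import median

def baseline_csf_summary(abeta, tau, ptau):
    """Collapse repeated baseline measurements to their median and derive the ratio.

    abeta, tau, ptau: non-empty lists of baseline values for one subject
    (hypothetical input format; all values in the same concentration unit).
    """
    summary = {
        "abeta": median(abeta),
        "tau": median(tau),
        "ptau": median(ptau),
    }
    summary["ptau_abeta_ratio"] = summary["ptau"] / summary["abeta"]
    return summary

# Illustrative values only (not ADNI data).
print(baseline_csf_summary([150, 160, 170], [90], [40, 44]))
```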

Amyloid PET data
Baseline Aβ deposition was visualised using 18F-florbetapir PET. Semi-quantitative PET results were retrieved from the latest available dataset ('UCBERKELEYAV45_10_17_16.csv'). The methods for PET acquisition and analysis are described in more detail elsewhere [22,44]. Florbetapir images consisted of 4 × 5-minute frames acquired at 50-70 minutes after injection, which were realigned, averaged, resliced to a common voxel size (1.5 mm) and smoothed to a common resolution of 8 mm in full width at half-maximum [45]. Structural T1-weighted images acquired concurrently with the baseline florbetapir images were used as a structural template to define the cortical regions of interest and the reference regions in native space for each subject, using FreeSurfer (version 4.5.0; surfer.nmr.mgh.harvard.edu) as described elsewhere [44]. Baseline florbetapir scans for each subject were coregistered to baseline structural magnetic resonance imaging scans, which were subsequently used to extract weighted cortical retention indices (SUV) from grey matter within four large cortical regions of interest (frontal, cingulate, parietal and temporal cortices) that were averaged to create a mean cortical SUV as described in greater detail online (adni.bitbucket.org/docs/UCBERKELEYAV45/UCBERKELEY_AV45_Methods_12.03.15.pdf).
Cortical SUVr values were obtained by normalising cortical SUV with the mean uptake in a subcortical reference region. For the present study, candidate reference regions were pons, whole cerebellum and a composite region made up of the whole cerebellum, pons and eroded subcortical white matter [19]. In what follows, the corresponding SUVr values are referred to as pontine SUVr, cerebellar SUVr and composite SUVr, respectively.
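The normalisation step amounts to a simple ratio, which the sketch below illustrates. The function names and the numerical values are hypothetical, and the unweighted average over the four cortical regions is a simplifying assumption (the ADNI pipeline extracts weighted regional indices).

```python
from statistics import mean

CORTICAL_REGIONS = ("frontal", "cingulate", "parietal", "temporal")

def cortical_mean_suv(region_suv):
    """Average the four cortical regions of interest (unweighted: an assumption)."""
    return mean(region_suv[name] for name in CORTICAL_REGIONS)

def suvr(cortical_suv, reference_suv):
    """SUVr = mean cortical SUV divided by the mean SUV of the reference region."""
    if reference_suv <= 0:
        raise ValueError("reference uptake must be positive")
    return cortical_suv / reference_suv

# Illustrative values only (not ADNI data).
cortex = cortical_mean_suv(
    {"frontal": 1.30, "cingulate": 1.45, "parietal": 1.35, "temporal": 1.25}
)
reference_uptake = {"pontine": 1.80, "cerebellar": 1.00, "composite": 1.40}
for name, ref in reference_uptake.items():
    print(f"{name} SUVr = {suvr(cortex, ref):.2f}")
```

Because the three reference regions have different uptake levels, the same cortical SUV gives SUVr values on different scales, which is why each reference region needs its own positivity cut-off.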

Statistical analyses
Continuous variables are presented as mean ± SD and categorical variables as number (percent). The diagnostic performance of CSF markers and SUVr was assessed through ROC analysis. For each parameter and each cutoff value, sensitivity was defined as the positivity rate in the patients with AD and specificity as the negativity rate in the control subjects/normal patients. The optimal cut-off value was that maximising Youden's index (sensitivity + specificity − 1). The concordance between PET profile based on SUVr values and CSF profile was evaluated using Cohen's kappa coefficient.
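The cut-off selection by Youden's index and the concordance measure can be sketched in a few lines. The analyses in this study were run in MATLAB; the Python code below is only an illustration, with hypothetical function names and toy data. The `positive_if_high` flag is an assumption introduced to handle markers that are abnormal when low (such as CSF Aβ) as well as markers that are abnormal when high (such as SUVr or tau).

```python
def youden_optimal_cutoff(values, labels, positive_if_high=True):
    """Return (cutoff, sensitivity, specificity, J) maximising Youden's index.

    labels: 1 for patients with AD, 0 for control subjects.
    Every observed value is tried as a candidate cut-off.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    best = None
    for cutoff in sorted(set(values)):
        if positive_if_high:
            predicted = [v >= cutoff for v in values]
        else:
            predicted = [v <= cutoff for v in values]
        tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
        tn = sum(1 for p, y in zip(predicted, labels) if not p and y == 0)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1
        if best is None or j > best[3]:
            best = (cutoff, sensitivity, specificity, j)
    return best

def cohen_kappa(profile_a, profile_b):
    """Cohen's kappa between two lists of categorical profiles."""
    n = len(profile_a)
    observed = sum(1 for a, b in zip(profile_a, profile_b) if a == b) / n
    categories = set(profile_a) | set(profile_b)
    expected = sum(
        (profile_a.count(c) / n) * (profile_b.count(c) / n) for c in categories
    )
    if expected == 1:
        return 1.0  # both raters use a single identical category
    return (observed - expected) / (1 - expected)

# Toy data only (not ADNI data).
suvr_values = [1.0, 1.1, 1.2, 1.4, 1.5, 1.6]
is_ad = [0, 0, 0, 1, 1, 1]
print(youden_optimal_cutoff(suvr_values, is_ad))
print(cohen_kappa([1, 1, 0, 0], [1, 1, 0, 1]))
```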
To test the association of baseline SUVr and CSF markers with diagnosis and prognosis, a multivariate analysis was conducted using a stepwise linear regression model with an entry criterion of p < 0.05 and a removal criterion of p > 0.1. To identify the independent determinants of baseline status and baseline cognition (MMSE, CDR and ADAS-cog), the following explicative factors were included in the model: sex, age, ApoE4 status, the four CSF variables and SUVr. To identify the independent predictors of final status, cognitive decline (annual change in MMSE, CDR and ADAS-cog) and time to conversion, the following explicative factors were included in the model: sex, age, ApoE4 status, baseline cognitive scores, the four CSF variables and SUVr. Categorical variables (sex, ApoE4 status, baseline and final status) were discretised, whereas (pseudo-)continuous variables (age, cognitive scores, CSF markers and SUVr) were processed as such. In each model, the three SUVr values based on the three candidate reference regions were tested separately, then jointly.
The correlation between baseline SUVr and cognitive score evolution was evaluated using least-squares quadratic regression and Spearman's rank correlation. The statistical significance of the mean annual changes in cognitive scores was tested using a z-test.
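For illustration, Spearman's rank correlation and a z-test on mean annual change can be written as below. This is a sketch under stated assumptions: ties receive average ranks, and the z-test is taken to compare the mean annual change against zero using the sample standard deviation, which the text does not specify. Function names are hypothetical.

```python
import math
from statistics import mean, stdev

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def z_test_mean_change(changes):
    """Two-sided z-test of the mean change against zero (assumed null value)."""
    z = mean(changes) / (stdev(changes) / math.sqrt(len(changes)))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Toy data only (not ADNI data).
print(spearman([1.0, 1.2, 1.4, 1.6], [0.1, 0.4, 0.9, 1.6]))
print(z_test_mean_change([1.0, 2.0, 3.0]))
```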
The predictive value of baseline PET and CSF profiles regarding conversion to AD was assessed using Kaplan-Meier survival curves and the log-rank test. HRs were adjusted using a Cox proportional hazards model including the following explanatory covariates: sex, age, ApoE4 status, baseline cognitive scores, PET profile, CSF Aβ and tau profiles. For patients who did not convert to AD, survival data were considered censored from the time of the last visit on record.
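The Kaplan-Meier estimate with right-censoring, as described above for non-converters, can be sketched as follows. The function name and the toy data are hypothetical; the log-rank test and the Cox model are not reproduced here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of the probability of remaining free of AD.

    times: follow-up duration per patient (e.g. months).
    events: 1 if the patient converted to AD at that time,
            0 if censored at the last visit on record.
    Returns a list of (time, survival probability), one entry per event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        conversions = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            conversions += data[i][1]
            leaving += 1
            i += 1
        if conversions > 0:
            survival *= 1 - conversions / at_risk
            curve.append((t, survival))
        # Converters and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve

# Toy data only (not ADNI data): three converters and two censored patients.
for t, s in kaplan_meier([6, 12, 12, 24, 30], [1, 0, 1, 1, 0]):
    print(f"month {t}: {s:.2f}")
```

Patients censored at a given time are still counted in the risk set at that time, which is the usual convention for this estimator.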
A two-sided p value ≤0.05 was considered statistically significant. For the multivariate analysis, p values were corrected for multiple comparisons using the Dunn-Šidák correction: $p_{\text{corrected}} = 1 - (1 - p)^{m}$, where m is the number of comparisons (here m = 9, the number of times the linear model was run). All statistical computations were performed using MATLAB R2013 software (MathWorks, Natick, MA, USA).

Results
Figure 1 presents the patient flow diagram. For the cross-sectional cohort, the patient demographics, ApoE4 status, baseline cognitive scores and CSF markers are detailed in Table 1. The differences between control subjects and patients with MCI and between patients with MCI and patients with AD were highly significant for ApoE4 status, cognition and all four CSF markers. For the longitudinal cohort, the patient demographics, ApoE4 status, baseline cognition and CSF, annual change in cognitive scores during follow-up and time to conversion are detailed in Table 2. Of the 396 patients with SMC/MCI at baseline, 209 (53%) were classified as having MCI at their last visit, 105 (27%) were ranked as normal (mostly patients with baseline SMC, and 19 patients with baseline MCI who reverted to normal) and 82 (21%) converted to AD (1 SMC, 19 early MCI and 62 late MCI). The differences in baseline cognition and CSF markers were highly significant between normal subjects and patients with MCI and between patients with MCI and patients with AD. Cognitive decline was similar in normal subjects and patients with MCI and markedly greater in patients with AD.
Figure 2 shows the distribution of (from left to right) pontine, cerebellar and composite SUVr values in the cross-sectional and longitudinal cohorts. In both cohorts, SUVr values were significantly lower in normal patients than in patients with MCI and in patients with MCI than in patients with AD, regardless of the reference region used. No difference was found between the homologous subsets of normal subjects and patients with AD from the two cohorts.
Table 3 details the results of the ROC analyses for SUVr and CSF markers. Sensitivity and specificity stand for, respectively, the rate of true-positives among patients with AD and the rate of true-negatives among control subjects/normal patients. Optimal cut-off values for SUVr were highly similar in the cross-sectional and longitudinal cohorts, whereas they differed substantially for the CSF markers. SUVr performances were globally higher than those of CSF markers. In both cohorts, the best diagnostic performance was achieved using composite SUVr with an AUC above 0.85, a sensitivity above 85% and a specificity above 80% in cross-sectional and longitudinal analyses. Overall, the predictive power of SUVr was superior to that of CSF markers, with risk ratios for evolving to AD ranging from 7 to 9.5 (vs 4.5 to 8 for CSF markers).
Figure 3 shows the frequencies of final status in the longitudinal cohort according to baseline PET (composite SUVr) and baseline CSF profile (Aβ/tau combination). Seventy-two percent of the patients had concordant Aβ/tau profiles (43% negative, 29% positive), and 28% had discordant Aβ/tau profiles (25% Aβ+/tau− and 3% Aβ−/tau+). There was no significant difference in mean follow-up duration between negative and positive profiles (PET, Aβ or tau).
The concordance between the PET and CSF profiles was good when SUVr was compared with Aβ (kappa > 0.8) and moderate when it was compared with tau and p-tau (kappa around 0.6-0.7), without substantial variation related to the chosen reference region (see Table 4 for details).
Tables 5 and 6 summarise the results of the multivariate analyses. SUVr p values reported in the tables are those obtained when the three SUVr values were evaluated separately. An asterisk designates the p values that remained significant when the three SUVr values were evaluated jointly. The coefficients of determination ($r^2$) reflect the proportion of the variance in the modelled variable that is predictable from each explanatory variable retained in the model. Regarding the cross-sectional cohort (Table 5), sex, tau level and SUVr were independent determinants of baseline status and cognitive scores (all corrected p values <0.001), whereas ApoE4 status and other CSF variables were not. The best determinants were pontine and composite SUVr, which showed similarly high association with patient status and cognitive level. For the longitudinal cohort (Table 6), baseline cognition (MMSE, CDR and ADAS-cog) was the main predictor of cognitive decline in terms of final status and annual deterioration in cognitive scores. In patients with MCI who converted to AD during follow-up (n = 82), baseline ADAS-cog score
Figure 5 presents the Kaplan-Meier curves for conversion to AD in patients with SMC/MCI according to baseline PET (composite SUVr) and CSF profiles. The Cox proportional hazards model shows that baseline ADAS-cog score was the strongest predictor for AD conversion ($p < 10^{-8}$). A positive baseline PET was associated with an adjusted HR of 3.8 for AD conversion (p = 0.01). CSF Aβ and tau were less predictive with adjusted HRs of 1.2 (not significant) and 1.8 (p = 0.03), respectively.

Discussion
In this study based on prospective data from the ADNI-2 cohort, we examined the complementary diagnostic and prognostic value of baseline CSF markers and 18F-florbetapir SUVr values computed using three different reference regions. We found that PET semi-quantitative assessment of Aβ load was significantly superior, although CSF and PET markers were both relevant determinants of cognitive status and predictive of cognitive decline in patients with MCI. Notably, as can be seen in Fig. 2 and Table 3, baseline SUVr distribution was similar in patients with baseline AD and patients with SMC/MCI who converted to AD during follow-up; hence, the optimal SUVr cut-offs to differentiate patients with AD from normal subjects were nearly identical in the cross-sectional and longitudinal cohorts. The optimal cut-offs for CSF markers were less robust, suggesting that PET quantitation might be preferable for accurate selection and therapeutic monitoring of individuals in clinical trials [46].
Our optimal cerebellar SUVr cut-off (1.22) was consistent with that proposed by Fleisher et al. (1.17), based on post-mortem neuropathological data [47]. A less conservative SUVr cut-off was proposed by Joshi et al. [17] as the upper bound of a one-tailed 95% CI of cerebellar SUVr distribution in young healthy control subjects, and it was used in other studies [22,24] as a positivity threshold for florbetapir PET. Such a low threshold based on young control subjects seems questionable, however, and may result in poor specificity (about 70% in the study by Landau et al. [22]), given that significant amyloid deposition without cognitive impairment is seen in 20% to 40% of normal elderly volunteers [14,48]. To our knowledge, this is the first attempt to provide optimal thresholds for pontine and composite SUVr, because recent studies involving extra-cerebellar reference regions have been aimed primarily at assessing the longitudinal accuracy of SUVr estimates [18,19].
Regarding the CSF ROC analyses, our optimal Aβ cut-off to differentiate patients with AD from normal control subjects (157 ng/L) was similar to that obtained by De Meyer et al. (159 ng/L) based on the ADNI-1 cohort [12]. Our optimal CSF Aβ cut-off to predict conversion to AD in the longitudinal analysis (171 ng/L) was closest to that proposed by Shaw et al. (192 ng/L) with reference to autopsy data [5], yielding comparable sensitivity and negative predictive value (respectively, 90% and 96% vs 96% and 95%). Our optimal cut-off for CSF tau (88 ng/L) was also similar to that mentioned by Shaw et al. (93 ng/L) [5]. The proportion of concordant CSF profiles in terms of Aβ and tau was 72% in both cohorts, in line with the 73% of concordant profiles reported by Sunderland et al. [4] in a cohort of patients with AD and control subjects.
In our cross-sectional cohort, the first interesting finding was that PET SUVr clearly outperformed CSF markers in determining patients' cognitive status, as evaluated in a multivariate model. Its diagnostic accuracy approached 85% in differentiating patients with AD from control subjects, and cognitive performance (MMSE, CDR and ADAS-cog) was significantly associated with pontine and composite SUVr in the whole population. The higher diagnostic performance of pontine and composite SUVr compared with cerebellar SUVr might be related to a lower signal-to-noise ratio in the cerebellum, leading to less accurate and more variable SUV measurements in this region. Researchers in previous studies pointed out that pontine and cerebellar uptake was prone to noise and longitudinal variability owing to the small size of the considered regions and their peripheral location in the PET scanner field of view, and they advocated for the use of composite reference regions taking into account cerebral white matter [18,19].
The CSF markers showed lower diagnostic value in ROC analysis (lower AUC and lower accuracy of 80% for Aβ and 75% for tau), and total tau was the sole CSF marker to bring added diagnostic value. Palmqvist et al. [35] noted that 18F-flutemetamol cerebellar SUVr was correlated with global cognition and hippocampal atrophy in patients with increased Aβ load, whereas CSF Aβ was not. These data are consistent with a commonly accepted model of AD pathological cascade, according to which Aβ deposition takes place at an early stage in the natural history of the disease and tau-mediated neuronal injury occurs secondarily [3]. Yet, although CSF Aβ reaches a plateau prior to the prodromal state, PET retention gradually increases during progression to AD [49]. Semi-quantitative amyloid PET may thus be more appropriate than CSF markers for early-stage grading of AD. To be fully operative and allow efficient discrimination between neurodegenerative diseases, it has to be integrated within the range of available biomarkers, including tau-specific PET tracers currently under clinical assessment [50].
The second original finding, which might have stronger practical implications, was that baseline PET SUVr was more predictive of clinical evolution and AD conversion than CSF markers and that baseline SUVr levels directly correlated with the subsequent rate of cognitive decline. Composite SUVr predictive accuracy regarding final status reached 84% compared with 79% for both CSF Aβ and tau. In line with prior reports, cognitive measures at baseline were the best predictors of cognitive evolution and AD conversion [51,52]. Baseline pontine and composite SUVr were moderate but significant predictors of final status and mean annual CDR and ADAS-cog change in multivariate analysis, whereas CSF markers had little or no impact on cognitive evolution. Cognitive decline as reflected by the mean annual changes in MMSE, CDR and ADAS-cog was significantly correlated with baseline composite SUVr. The mean annual changes in CDR and ADAS-cog were significant in patients with positive baseline PET, whereas patients with negative baseline PET did not incur significant CDR and ADAS-cog modification during follow-up (Fig. 4). Among patients with a negative baseline PET (rated using composite SUVr), 4% were AD converters, and among those with a positive PET scan, 42% were AD converters. This yielded an adjusted HR for AD conversion of 3.8 (p = 0.01). Notably, the PET profile appeared decisive in patients with discordant CSF markers (99 Aβ+/tau− and 13 Aβ−/tau+). In these patients, an abnormal amyloid PET resulted in a five-fold increase in AD conversion risk (25% vs 5% in patients with a normal amyloid PET; see Fig. 3). It would seem that even in patients with a concordant positive CSF profile (Aβ+/tau+), a negative PET is associated with a moderate risk of AD conversion (7% vs 55% in patients with a positive PET), though Aβ+/tau+/PET− profiles were too few to ensure sufficient statistical power.
Patients with a negative PET profile who evolved to AD during follow-up might either correspond to PET false-negatives or to cases of non-amyloid dementias. The proportion of PET-positive patients who were ranked as normal during follow-up is consistent with previous evidence that 20% to 30% of cognitively normal elderly subjects harbour Aβ deposition [53].

Conclusions
Semi-quantitative amyloid PET and CSF markers yield complementary information for classifying normal subjects, patients with MCI and patients with AD. However, PET might be preferable for robust grading of early-stage AD, and cross-sectional cut-off values for SUVr seem to be directly transposable for longitudinal analysis. Amyloid PET quantification using a composite SUVr appears more powerful than CSF markers for MCI prognosis in terms of AD conversion, and progressive cognitive decline is correlated with baseline composite SUVr. In patients with an equivocal CSF profile, amyloid PET effectively differentiates patients with high risk of AD conversion.