Study design and participants
In this cross-sectional study, we used baseline data from the Catch-Cog study, an international, multicenter, prospective cohort study. Participants (N = 184) were recruited via (1) the Alzheimer Center Amsterdam, Amsterdam UMC, location VU University Medical Center, The Netherlands (n = 102); (2) the Alzheimer Center Erasmus Medical Center (EMC, n = 14), Rotterdam, The Netherlands; (3) the University Medical Center Groningen (UMCG, n = 39), The Netherlands; and (4) the Centre for Dementia Prevention, Edinburgh, Scotland (n = 29). We recruited participants who met the research criteria for SCD [28], the clinical criteria for MCI due to AD [2], probable AD dementia [29], or DLB dementia [15]. Other inclusion criteria were (1) Mini-Mental State Examination (MMSE) score ≥ 18, (2) age ≥ 50, and (3) availability of a study partner who was able and willing to participate. Exclusion criteria were (1) presence of any other neurological disorder, (2) presence of a major psychiatric disorder such as severe personality disorder or depression (Geriatric Depression Scale score ≥ 6 [30]), (3) current abuse of alcohol and/or drugs, and (4) simultaneous participation in a clinical trial.
Before inclusion, participants had undergone a standard diagnostic work-up in their study center, including at least medical history, neurological examination, and cognitive assessment. Structural brain imaging was available for a subset of the study cohort. Diagnoses were made during a multidisciplinary consensus meeting attended by at least a neurologist or psychiatrist, with neuropsychology input. In the UMCG, SCD and MCI participants were also recruited via advertisements in local newspapers. After responding to an advertisement, eligible participants were screened by a neuropsychologist and a neurologist to determine whether they met the criteria for SCD or MCI [28].
The Medical-Ethical Committee of the VU University Medical Center approved the study for all Dutch centers. The South East Scotland Research Ethics Committee approved the study for the Scottish site. All participants and study partners provided written and oral informed consent.
The cognitive-functional composite
Cognitive component
The cognitive test battery of the CFC included the three ADAS-Cog memory subscales Word Recognition, Word Recall, and Orientation [3]; the Controlled Oral Word Association Test (COWAT); the category fluency test (CFT); Digit Span Backward (DSB); and the Digit Symbol Substitution Test (DSST) [31]. During the word recognition test, the participant is required to learn a list of 12 words and identify these words when mixed among 12 other distracter words (one point for each incorrect response, score range 0–12). During word recall, the participant is given three trials to learn a list of ten high-imagery nouns (the total score is the average number of words not recalled across the three trials, score range 0–10). The orientation subtest includes eight questions regarding the participant’s orientation to person, place, and time (one point for each incorrect response, score range 0–8). The COWAT assesses the participant’s phonemic fluency using the letters D-A-T in Dutch or F-A-S in English, with 60 s per letter (one point for each correct non-repeated word). The CFT examines the participant’s semantic fluency by requiring them to generate as many exemplars of the category animals as possible within 60 s (one point for each correct unique animal). The DSB requires the participant to reproduce sequences of digits of increasing length in reversed order (score range 0–12). The DSST is a timed EF test during which participants have to substitute as many digits as possible with unique geometric symbols within 90 s (one point for each correctly substituted symbol).
Functional component
The functional component consisted of the short version of the A-IADL-Q [21]. The A-IADL-Q is a computerized, informant-based questionnaire covering a broad range of complex IADL [19]. The short version consists of 30 items covering household, administration, work, computer use, leisure time, appliances, and transport activities. For each item, difficulty in performance is rated on a 5-point Likert scale (ranging from “no difficulty in performing this task” to “no longer able to perform this task”). Scoring is based on item response theory, a paradigm linking item responses to an underlying latent trait [32]. This results in a latent trait score (z-score), reflecting one’s level of IADL functioning, with higher scores indicating better IADL functioning [21].
CFC scoring
To create CFC scores, the directionality of the three ADAS-Cog subtest scores was reversed, so that higher scores reflected better performance. Subsequently, all cognitive subtest scores were transformed into z-scores with the total group mean and standard deviation (SD) as reference values. The cognitive composite was computed as a weighted z-score of all seven cognitive subtests, whereas the functional component score was the A-IADL-Q score. The overall CFC composite score was computed as a weighted z-score of the cognitive composite and A-IADL-Q, with higher scores indicating better performance.
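The scoring steps described above (reversal, standardization, weighted combination) can be illustrated with a minimal Python sketch. The weights are not specified in this section, so equal weights are assumed here purely for illustration; the function names and the four-participant data are hypothetical and are not study data.

```python
from statistics import mean, stdev


def reverse_scores(values, max_score):
    """Flip an error-based score so that higher means better performance."""
    return [max_score - v for v in values]


def z_scores(values):
    """Standardize against the total-group mean and sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def weighted_composite(z_by_test, weights):
    """Weighted mean of z-scores per participant (weights are an assumption)."""
    names = list(z_by_test)
    total_weight = sum(weights[n] for n in names)
    n_participants = len(z_by_test[names[0]])
    return [
        sum(weights[n] * z_by_test[n][i] for n in names) / total_weight
        for i in range(n_participants)
    ]


# Made-up scores for four participants (not study data).
word_recall_errors = [2.0, 4.0, 6.0, 8.0]  # range 0-10, higher = worse
dsst_correct = [60, 45, 30, 15]            # higher = better

z_by_test = {
    "word_recall": z_scores(reverse_scores(word_recall_errors, max_score=10)),
    "dsst": z_scores(dsst_correct),
}
equal_weights = {"word_recall": 1.0, "dsst": 1.0}  # assumption: equal weights
cognitive_composite = weighted_composite(z_by_test, equal_weights)
```

The same `weighted_composite` helper could in principle combine the cognitive composite with the A-IADL-Q score to give an overall score, again under whatever weighting is actually applied.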
Reference measures
Traditional tests of cognition and function
The traditional tests against which the CFC was compared were the MMSE, ADAS-Cog-13, ADCS-ADL, and CDR-SB. The MMSE is a global cognitive screening test, with a total score ranging from 0 to 30 and higher scores reflecting better performance [33]. The ADAS-Cog-13 yields a measure of cognitive performance by combining ratings of 13 subtests (e.g., word list recognition and recall, constructional praxis, object and finger naming). Total scores range from 0 to 85, with higher scores indicating more severe impairment [3]. The ADCS-ADL assesses the functional abilities affected in mild-to-moderate AD. For 23 different basic and instrumental activities, the levels of performance and independence during the past 4 weeks were rated by the study partner. Total scores range from 0 (non-performance or need for extensive help) to 78 (independent performance) [26]. The CDR has been developed for the staging of dementia severity. The participant’s cognitive and functional performance is rated in 6 areas: memory, orientation, judgment and problem solving, community affairs, home and hobbies, and personal care. Each area is rated as 0 (healthy), 0.5 (questionable dementia), 1 (mild dementia), 2 (moderate dementia), or 3 (severe dementia). Adding the ratings of all boxes results in a total CDR-SB score ranging from 0 to 18, with higher scores reflecting more severe dementia [25, 34]. The ADCOMS is a recently designed, statistically derived composite scoring procedure, consisting of two MMSE items (“orientation to time” and “copy design”), 4 ADAS-Cog subtests (delayed word recall, orientation, word recognition, and word recall), and all 6 CDR-SB subscores. All items are differentially weighted, yielding a score ranging from 0 to 1.97 with higher scores implying greater impairment [27].
Reference measures of disease severity
Informant reports of disease severity included the Cognitive Function Instrument study partner version (CFI-SP) [35], Quality of Life in Alzheimer’s Disease (QoL-AD) [36], the short version of the Zarit Burden Inventory (ZBI-12) [37], and the Apathy Evaluation Scale (AES) [38]. The CFI-SP includes 14 items on decline in day-to-day cognitive and functional abilities compared to 1 year ago. Response options include “yes” (1), “no” (0), or “maybe” (0.5), with total scores ranging from 0 to 14. The QoL-AD consists of 13 items, rated on a 4-point scale. Total scores range from 13 to 52, with higher scores reflecting better quality of life. The ZBI is one of the most commonly used instruments for assessing aspects of caregiver burden [37]. Each of the 12 items was rated on a 5-point scale, with total scores ranging from 0 to 48 and higher scores suggesting greater caregiver burden. The AES consists of 18 statements about the participant’s thoughts, feelings, and activity, which are rated on a 4-point scale. Total scores range from 18 to 72, with higher scores indicating more severe apathy.
Magnetic resonance (MR) images were acquired locally at each center on 3 T scanners. A minimum acceptable protocol was approved and then optimized at each site to account for scanner differences (see Additional file 1). The images were checked for quality by an experienced rater. Volumetric measurements were obtained from 3D T1-weighted (3DT1) images with Statistical Parametric Mapping 12 (SPM12) software (Wellcome Trust Centre for Neuroimaging, University College London, UK) running in MATLAB 2011a (MathWorks Inc., Natick, MA, USA). Prior to processing, the origin in each scan was manually set to the anterior commissure. Scans were segmented into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF). Total GM (i.e., the sum of all GM voxels) and total intracranial volume (TIV) (i.e., the sum of GM, WM, and CSF volumes) were derived from the segmented images in native space (in liters). Cortical volume was defined as the total GM volume normalized for head size, that is, divided by TIV.
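As a small worked illustration of the head-size normalization, the sketch below computes TIV as the sum of the three tissue volumes and then divides total GM volume by TIV. The function names and the example volumes are hypothetical; in practice the inputs would come from the segmentation output.

```python
def total_intracranial_volume(gm_l, wm_l, csf_l):
    """TIV in liters: sum of GM, WM, and CSF volumes."""
    return gm_l + wm_l + csf_l


def normalized_gm_volume(gm_l, wm_l, csf_l):
    """Total GM volume divided by TIV (a unitless fraction)."""
    tiv = total_intracranial_volume(gm_l, wm_l, csf_l)
    if tiv <= 0:
        raise ValueError("TIV must be positive")
    return gm_l / tiv


# Made-up volumes in liters (not study data): 0.60 / (0.60 + 0.45 + 0.45)
example_fraction = normalized_gm_volume(0.60, 0.45, 0.45)
```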
Procedures
Study visits took place at the hospital or the participant’s home, depending on the participant’s preference. A trained rater administered the cognitive tests according to standardized instructions, starting with the MMSE, followed by the cognitive part of the CFC (word recognition, orientation, CFT, COWAT, DSST, DSB, word recall) and the remaining ADAS-Cog-13 tests. In the meantime, the study partner completed the A-IADL-Q, ZBI, and QoL-AD independently on an iPad. Subsequently, the participant completed the QoL-AD on the iPad with assistance from the rater. Finally, the rater completed the ADCS-ADL and CDR interview with the study partner. The total duration of a complete assessment was approximately 90 min. A shortened protocol was used for the SCD and DLB participants, as it was not our purpose to compare the CFC with traditional tests that were not designed to assess progression in these groups. Therefore, SCD and DLB participants only underwent the MMSE and the cognitive battery of the CFC, whilst their study partner completed the A-IADL-Q.
MRI procedures
MR scans acquired less than 6 months prior to the study visit were available for a subset of the study cohort. These included at least 3D T1- and T2-weighted imaging (T2) and 3D fluid-attenuated inversion recovery (FLAIR). Participants without a recent MRI scan who agreed to undergo a structural MRI scan were also scanned at 3 T with the same structural sequences, which took about 30 min.
Statistical analyses
Statistical analyses were performed using SPSS version 22.0 (IBM Corp., Armonk, NY) and R Studio (R Core Team, 2018). The statistical significance level was set at p < .05, unless otherwise indicated. Demographic and clinical differences between the groups were investigated using chi-square tests, one-way analyses of variance (ANOVA) followed by Hochberg’s post hoc tests, and independent t tests for measures only available for the MCI and AD groups.
Construct validity and clinical relevance
We performed confirmatory factor analyses (CFA) including all CFC subtests to investigate the CFC’s underlying factor structure. We evaluated a single-factor, a two-factor (memory and EF), and a three-factor (memory, EF, and IADL) model. In the two-factor model, the memory factor included the word recognition, orientation, and word recall tests, and the EF factor included the CFT, COWAT, DSST, DSB, and A-IADL-Q. The three-factor model had the same memory and EF factors, except that the A-IADL-Q was excluded from the EF factor and included as a separate factor. We compared these models using chi-square tests and by evaluating their Comparative Fit Index (CFI), root mean square error of approximation (RMSEA), and standardized root mean square residual (SRMR), with CFI ≥ .90, RMSEA < .08, and SRMR < .08 considered to indicate adequate fit [39]. We hypothesized that the three-factor model would fit, based on preparatory work on the cognitive component showing two underlying factors [40], and on the A-IADL-Q reflecting one underlying factor [19]. As a sensitivity analysis, all aforementioned CFA model evaluations were repeated in a restricted sample of MCI and mild AD participants, as this was the primary target population of the CFC.
Next, we investigated the differences in CFC scores across diagnostic groups using ANOVA followed by Hochberg’s post hoc tests, to examine whether scores would decrease from SCD to dementia. We assessed the association between the CFC variables and reference measures of disease severity by performing linear regression analyses with each reference measure (CFI-SP, QoL-AD, ZBI-12, and AES) as the dependent variable and CFC score, age, sex, and education as independent variables. To evaluate the added clinical value of the A-IADL-Q, we also investigated a second model including the cognitive component score and A-IADL-Q score as separate independent variables. The association between CFC score and gray matter volume was assessed with a linear regression analysis correcting for age, sex, years of education, and scanner type. We computed Pearson’s correlation coefficients to investigate the relation between CFC scores and traditional cognitive and functional measures.
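The three candidate factor structures and the fit criteria can be written down compactly, as in the following Python sketch. It only encodes the model specifications and the threshold check; it does not fit a CFA. The variable names are hypothetical, and any fit values passed to the helper are supplied by the caller rather than taken from the study.

```python
INDICATORS = [
    "word_recognition", "orientation", "word_recall",
    "cft", "cowat", "dsst", "dsb", "a_iadl_q",
]

MEMORY = ["word_recognition", "orientation", "word_recall"]
EF_TESTS = ["cft", "cowat", "dsst", "dsb"]

# Factor name -> indicators loading on that factor, for each candidate model.
MODELS = {
    "single_factor": {"general": INDICATORS},
    "two_factor": {"memory": MEMORY, "ef": EF_TESTS + ["a_iadl_q"]},
    "three_factor": {"memory": MEMORY, "ef": EF_TESTS, "iadl": ["a_iadl_q"]},
}


def adequate_fit(cfi, rmsea, srmr):
    """Apply the stated cut-offs: CFI >= .90, RMSEA < .08, SRMR < .08."""
    return cfi >= 0.90 and rmsea < 0.08 and srmr < 0.08


def indicators_used(model):
    """Flatten a model specification into its sorted list of indicators."""
    return sorted(name for items in model.values() for name in items)
```

A call such as `adequate_fit(0.95, 0.06, 0.05)` returns `True`, whereas a model with a CFI below .90 would be flagged as not meeting the criteria regardless of its other indices.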
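To make the covariate-adjusted regression and the correlation analysis concrete, the following is a minimal pure-Python sketch of ordinary least squares (via the normal equations) and Pearson's r. It is not the SPSS or R procedure used in the study. For brevity only age is included as a covariate, and the outcome is a synthetic variable constructed from known coefficients so that the fit can be checked; all names and numbers are hypothetical.

```python
from math import sqrt


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def ols(y, predictors):
    """Return [intercept, b1, b2, ...] for y regressed on predictor columns."""
    rows = [[1.0] + list(vals) for vals in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Synthetic data (not study data): outcome = 20 - 3 * CFC + 0.1 * age.
cfc = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
age = [72.0, 65.0, 70.0, 60.0, 68.0, 58.0]
outcome = [20.0 - 3.0 * c + 0.1 * a for c, a in zip(cfc, age)]

intercept, b_cfc, b_age = ols(outcome, [cfc, age])
r_cfc_outcome = pearson_r(cfc, outcome)
```

Because the synthetic outcome contains no noise, the estimated coefficients recover the generating values; with real data, standard errors and p values would also be needed, which this sketch does not compute.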
Quality for the target population
As the CFC was initially designed for MCI and mild AD dementia, the comparison analyses with traditional measures and ADCOMS were performed using the MCI and AD groups only. Using this sample, we compared histograms of score distributions of the CFC, traditional tests, and ADCOMS to inspect range restrictions in scoring. To allow for appropriate comparisons between the CFC components and traditional tests, the histograms for the ADAS-Cog, ADCS-ADL, and CDR-SB score distributions were based on the standardized scores. Additionally, we reported original score ranges and distribution parameters (percentiles, skewness, and kurtosis) for all tests.
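The distribution parameters mentioned above can be sketched in Python as follows. This is an illustration only: skewness and kurtosis are computed here from population central moments, which is an assumption and may differ slightly from the bias-adjusted estimates reported by statistical packages. The function names and example scores are hypothetical.

```python
from statistics import mean, quantiles


def central_moment(values, order):
    """Population central moment of the given order."""
    m = mean(values)
    return sum((v - m) ** order for v in values) / len(values)


def skewness(values):
    """Moment-based skewness; negative values indicate a left tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution gives 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


def describe(values):
    """Range, quartiles, skewness, and excess kurtosis of a score list."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "max": max(values),
        "p25": q1,
        "p50": median,
        "p75": q3,
        "skewness": skewness(values),
        "excess_kurtosis": excess_kurtosis(values),
    }


# Made-up scores piled up near a maximum of 30 (not study data).
ceiling_scores = [30, 30, 30, 29, 29, 28, 27, 25, 22, 18]
summary = describe(ceiling_scores)
```

Scores that cluster near the maximum, as in the made-up example, produce negative skewness, which is the kind of range restriction the histograms are meant to reveal.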