
Comparison and aggregation of event sequences across ten cohorts to describe the consensus biomarker evolution in Alzheimer’s disease



Previous models of Alzheimer’s disease (AD) progression were primarily hypothetical or based on data originating from single cohort studies. However, cohort datasets are subject to specific inclusion and exclusion criteria that influence the signals observed in their collected data. Furthermore, each study measures only a subset of AD-relevant variables. To gain a comprehensive understanding of AD progression, the heterogeneity and robustness of estimated progression patterns must be understood, and complementary information contained in cohort datasets be leveraged.


We compared ten event-based models that we fit to ten independent AD cohort datasets. Additionally, we designed and applied a novel rank aggregation algorithm that combines partially overlapping, individual event sequences into a meta-sequence containing the complementary information from each cohort.


We observed overall consistency across the ten event-based model sequences (average pairwise Kendall’s tau correlation coefficient of 0.69 ± 0.28), despite variance in the positioning of mainly imaging variables. The changes described in the aggregated meta-sequence are broadly consistent with the current understanding of AD progression, starting with cerebrospinal fluid amyloid beta, followed by tauopathy, memory impairment, FDG-PET, and ultimately brain deterioration and impairment of visual memory.


Overall, the event-based models demonstrated similar and robust disease cascades across independent AD cohorts. Aggregation of data-driven results can combine complementary strengths and information of patient-level datasets. Accordingly, the derived meta-sequence draws a more complete picture of AD pathology compared to models relying on single cohorts.


Alzheimer’s disease, in combination with its clinical manifestation/syndrome (AD) [1], is a progressive, multifaceted disease whose cognitive symptoms surface years after disease onset [2]. In order to identify crucial opportunities for medical interventions that could potentially prevent or delay symptoms, it is vital to understand the temporal relationship of pathological changes underlying the progressive nature of AD. To this end, cognitive assessments and a wide range of biomarkers, including cerebrospinal fluid (CSF) markers and neuroimaging-derived measures, have been established to monitor the disease’s progression. Measuring these markers enables the observation of biochemical, structural, functional, and cognitive changes that occur as the disease progresses [3] and the resulting data can build the basis for data-driven approaches that aim to determine the relative temporal dependencies between biomarkers and cognitive symptoms [4]. Previously, a variety of data-driven models have been developed with the aim of accomplishing this task [5,6,7,8,9,10].

One model archetype that has found wide success in the context of neurodegenerative diseases [11,12,13,14] and AD specifically [15] is the event-based model (EBM) [13]. It is a data-driven probabilistic generative model that characterizes the progression of a disease in the form of a single sequence of events which describes the relative order of measured markers turning from a normal state to a diseased state (i.e., abnormal state). Such event sequences carry the benefit that they are highly interpretable and, although describing disease progression, can already be learned from cross-sectional cohort study data. Previously, EBMs have been used to derive event sequences [13], stage subjects in their disease progression [15], predict conversion from one clinical stage to the other (i.e., cognitively unimpaired (CU) to mild cognitive impairment (MCI), or MCI to AD) [16], and uncover disease phenotypes with distinct temporal progression patterns.

To build an EBM, patient-level data are needed on which the model can be fit. In recent decades, an increasing number of observational cohort studies have released their collected data for research purposes, including the Alzheimer’s Disease Neuroimaging Initiative (ADNI) [17], the European Prevention of Alzheimer’s Dementia (EPAD) [18], and AddNeuroMed [4]. So far, however, only a few studies in the AD domain have applied EBMs to data from other cohorts besides ADNI [19, 20]. Previous work evaluating data-driven progression modeling based on cohort datasets has shown that the participant recruitment procedures can introduce cohort-specific systematic statistical biases into the collected data [21], which, in turn, can bias the estimation of disease progression [22]. Therefore, it is necessary to replicate and validate data-driven results in independent cohorts to ensure robust conclusions. Consequently, it remains unclear whether event sequences determined from one cohort dataset would generalize beyond the discovery cohort itself and, further, if sequences generated across several cohorts were concordant among each other. Simultaneously, gaining a comprehensive event sequence combining all relevant AD biomarkers, cognitive assessments, and functional scores is infeasible, since cohort studies can only measure a limited set of variables that are often only partially overlapping between them [23]. In theory, however, this allows for an estimation of individual event sequences from distinct cohorts which cover complementary sets of markers. Aggregating results across cohorts would harness this complementary information by assembling a meta-sequence that provides a more complete picture of the development and progression of AD.

In this work, we present a systematic, in-depth comparison of AD event sequences derived from ten independent landmark cohort studies to investigate the generalizability and robustness of EBM-derived AD progression patterns. Furthermore, we designed a novel rank aggregation algorithm which we used to aggregate the event sequences into a single meta-sequence, thereby fusing the complementary information in all variables assessed across the studies. Our work harnesses the heterogeneity in cohort study designs and measurements to produce a meta-sequence providing a more complete, and robust, picture of the temporal order of pathological marker changes in AD progression.


Investigated cohort datasets

We selected ten independent AD cohort studies for our analysis by systematically exploring suitable datasets using the ADataViewer [23]. The prerequisite for including a cohort into our analysis was that (1) diagnostic staging into CU, MCI, and AD was performed [24]; (2) cross-sectional data was available for at least 10 patients per diagnostic group; and (3) multiple data modalities were collected. The cohorts that were ultimately selected are presented in Table 1. All cohorts followed the NINCDS-ADRDA diagnostic criteria [24].

Table 1 Selected cohorts, their number of participants per disease stage, and their number of considered variables

Variable selection

We aimed at including a wide spectrum of variables to uncover the temporal relationship across multimodal markers of AD pathology that capture, for example, different biochemical, cognitive, or structural changes. To be included, a variable had to have been measured in at least the CU and AD groups of the respective study to allow for later modeling. Furthermore, only a minimal number of missing values was tolerable, as participants with missing values in any of the ultimately selected variables had to be excluded from the analysis. This led to a trade-off between the inclusion of an increasing number of variables and the number of participants available for analysis. We present an example of variable inclusion and the effect on sample size in the supplementary material (Table S1). In total, 36 unique variables were selected from different data modalities covering neuropsychological and cognitive tests, CSF markers, and MRI-derived brain region volumes. The complete list of selected biomarkers and their corresponding modalities is presented in Table 2. The number of variables per cohort is given in Table 1.

Table 2 The selected biomarkers and their corresponding abbreviations


An available diagnosis of a participant as either CU, MCI, or AD was a prerequisite for inclusion. Any participant with a diagnosis of cognitive impairment that was not linked to AD by the respective study’s clinicians was excluded. Finally, only participants with complete data across all selected biomarkers could be used in our modeling approach. The number of participants per cohort and diagnostic group is described in Table 1.

Progression modeling via event-based models

The EBM derives a probabilistic sequence from patient-level data that describes the temporal order in which measured values of variables turn from a normal to an abnormal state. Each of these transitions is called an event. In this context, normality or abnormality are defined non-parametrically using kernel density estimation mixture modeling on the empirical values of the modeled cohort’s CU and AD populations, respectively [35]. This probabilistic allocation of measurements into two groups allows study participants (in particular, patients) to have a mix of occurred and non-occurred events across all measurements which lays the foundation to estimate the most likely event sequence. Here, the EBM assumes that the biomarkers monotonically change towards abnormality as the disease progresses and that this process is irreversible. Furthermore, there are no a priori assumptions regarding predefined disease stages, cut points determining the abnormality of biomarkers, or the temporal relationship between them. The most likely sequence of events S is then estimated by maximizing the likelihood Pr(X|S) (Eq. 1), where variable measurements are denoted by xij ∈ X, with i = 1, …, M indexing the markers and j = 1, …, N indexing the individual samples.

$$\Pr(X \mid S) = \prod_{j=1}^{N} \left[ \sum_{m=0}^{M} \left\{ \prod_{i=1}^{m} \Pr\left(x_{ij} \mid E_i\right) \prod_{i=m+1}^{M} \Pr\left(x_{ij} \mid \neg E_i\right) \right\} \right] \tag{1}$$

Here, Pr(xij | Ei) and Pr(xij | ¬Ei) describe the probability of observing the value of xij given that the event Ei (i.e., variable i turning abnormal) has, or has not, occurred, respectively. For more details, we refer to the Supplementary Material and the original publication of the KDE EBM by Firth et al. [35]. The derived mixture models per cohort and measurement are presented in Fig. S3.
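To make Eq. 1 concrete, the following minimal Python sketch evaluates the logarithm of Pr(X|S) for one candidate sequence. It is an illustration only and not the implementation used in this work: the function name, the argument layout, and the toy probabilities are assumptions, and the per-marker probabilities are taken as given rather than estimated with a KDE mixture model.

```python
import math

def sequence_log_likelihood(p_event, p_no_event, sequence):
    """Log of Pr(X|S) as written in Eq. 1 (hypothetical helper).

    p_event[j][i]    -- Pr(x_ij | E_i) for subject j and marker i
    p_no_event[j][i] -- Pr(x_ij | not E_i)
    sequence         -- marker indices in the hypothesised event order S
    """
    n_markers = len(sequence)
    total = 0.0
    for pe, pn in zip(p_event, p_no_event):
        stage_sum = 0.0
        # Sum over all stages m = 0..M: the first m events of S have occurred.
        for m in range(n_markers + 1):
            term = 1.0
            for pos, i in enumerate(sequence):
                term *= pe[i] if pos < m else pn[i]
            stage_sum += term
        total += math.log(stage_sum)
    return total

# Toy input (assumed values): one subject, two markers.
p_event = [[0.9, 0.2]]
p_no_event = [[0.1, 0.8]]
ll_01 = sequence_log_likelihood(p_event, p_no_event, [0, 1])
ll_10 = sequence_log_likelihood(p_event, p_no_event, [1, 0])
```

In this toy case, marker 0 looks abnormal and marker 1 looks normal, so the order in which marker 0 turns abnormal first receives the higher likelihood.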

To quantify the similarity of distinct event sequences, we calculated the pairwise Kendall’s tau rank correlation coefficient (KTC) across sequences and the Bhattacharyya coefficient (BC) for specific events as explained in Oxtoby et al. [12]. The KTCs were calculated pairwise across all cohorts while considering only the relative ranks of variables which were common among the respective two cohorts’ sequences. An average KTC that is close to 1 and shows low standard deviation across the cohorts would indicate high concordance. An average BC close to 1 implies high similarity in the positional variance of ranks while the BC amounts to 0 for completely different patterns.
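As an illustration of the two similarity measures, the sketch below computes Kendall’s tau on the variables shared by two sequences and the Bhattacharyya coefficient between two positional probability distributions. The function names are assumptions, the example sequences are invented, and the labels HIPPO and VENT are hypothetical placeholders; since event sequences contain no ties, the simple tau-a form is used.

```python
import math
from itertools import combinations

def kendall_tau_common(seq_a, seq_b):
    """Kendall's tau restricted to variables present in both sequences."""
    in_b = set(seq_b)
    common = [v for v in seq_a if v in in_b]  # keeps the order of seq_a
    if len(common) < 2:
        return None  # undefined for fewer than two shared variables
    rank_b = {v: r for r, v in enumerate(seq_b)}
    concordant = discordant = 0
    for u, v in combinations(common, 2):  # u precedes v in seq_a
        if rank_b[u] < rank_b[v]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)

def bhattacharyya_coefficient(p, q):
    """BC between two discrete distributions over the same positions."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

# Invented example sequences; only ABETA, TAU, and LDEL are shared.
seq_a = ["ABETA", "TAU", "LDEL", "HIPPO"]
seq_b = ["ABETA", "LDEL", "TAU", "VENT"]
tau = kendall_tau_common(seq_a, seq_b)
```

Here two of the three shared pairs are ordered identically and one is swapped, so tau equals 1/3.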

Generating a meta-sequence based on event sequences derived from multiple cohort studies

To generate a meta-sequence, we propose a method that combines individual event sequences (called “base sequences”) stemming from independent datasets. We assemble a meta-sequence in a two-step procedure: first, building on the ideas presented in [36] and [37], we generate all possible sequences comprising k variables that are randomly drawn from the union of variables encountered in the base sequences (with k < total number of variables). The generated sequence with the minimum average distance to all base sequences is selected as a starting sequence for the next step. In step 2, this starting sequence is extended by iteratively adding the remaining variables to it (i.e., those not in the k variables of the starting sequence), such that the average distance between the altered sequence and all base sequences remains minimal. Here, the new variable is not necessarily added to the end of the sequence but all possible positions are considered. This process is repeated until all variables have been included in the sequence, which finally forms the aggregated meta-sequence. The algorithm is thus deterministic once the base sequences are calculated. Splitting the algorithm into two steps (an exhaustive search for the first k variables followed by the greedy insertions) was necessary, as the search space (i.e., all possible meta-sequences) grows exponentially with the number of variables in the base sequences. Further explanations about the algorithm, the handling of partially overlapping lists, and access to the corresponding Python code are provided in the Supplementary Material and Fig. S1.
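The two-step procedure can be sketched as follows. This is a simplified illustration under stated assumptions and not the code referenced in the Supplementary Material: partially overlapping lists are handled here by re-ranking both sequences on their shared variables, the remaining variables are inserted in alphabetical order, ties are broken by taking the first minimum, and all function names are hypothetical.

```python
from itertools import permutations

def footrule(candidate, base):
    """Spearman's footrule on the variables shared by both sequences.

    Assumption: both sequences are re-ranked on their shared variables.
    """
    shared = set(candidate) & set(base)
    cand = [v for v in candidate if v in shared]
    ref = [v for v in base if v in shared]
    pos_ref = {v: r for r, v in enumerate(ref)}
    return sum(abs(r - pos_ref[v]) for r, v in enumerate(cand))

def mean_distance(candidate, bases):
    """Average footrule distance between a candidate and all base sequences."""
    return sum(footrule(candidate, b) for b in bases) / len(bases)

def aggregate(bases, k):
    """Exhaustive search over k variables, then greedy insertion of the rest."""
    variables = sorted(set().union(*bases))
    # Step 1: evaluate every ordered selection of k variables.
    start = min(permutations(variables, k),
                key=lambda s: mean_distance(s, bases))
    meta = list(start)
    # Step 2: insert each remaining variable at its best position.
    for v in variables:
        if v in meta:
            continue
        options = [meta[:p] + [v] + meta[p:] for p in range(len(meta) + 1)]
        meta = min(options, key=lambda s: mean_distance(s, bases))
    return meta

# Three partially overlapping toy base sequences.
bases = [["A", "B", "C"], ["B", "C", "D"], ["A", "C", "D"]]
meta_sequence = aggregate(bases, k=2)
```

In the toy example no base sequence contains all four variables, yet their relative orders are mutually compatible, so the aggregated sequence recovers the full order A, B, C, D.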

We designed and applied two algorithms for generating a meta-sequence: one based on the maximum likelihood (ML) sequences presented by EBMs and one relying on bootstrapping. In the former, only the ML base sequences of each cohort were used as an input to our algorithm. Consequently, only the rank of each event is considered, while its positional variance within a sequence is not taken into account.

In the bootstrapping approach, all base sequences are resampled b times with replacement. This means that a new base sequence is generated per cohort based on a sample of that cohort’s participants that was randomly drawn with replacement and is of equal size to the original cohort. For each of these b sets of base sequences, one meta-sequence is generated. The resulting consensus over the b meta-sequences is visualized using a positional variance diagram which displays the variation in event ranks exhibited across the generated meta-sequences.
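The bootstrapping loop and the positional frequencies behind a positional variance diagram could be organized as in the sketch below. It is a schematic outline rather than the implementation used here: `fit_sequence` and `aggregate` are hypothetical callables standing in for the EBM fit and the rank aggregation, and the stand-ins in the usage lines are deliberately trivial.

```python
import random
from collections import Counter

def bootstrap_positions(cohorts, fit_sequence, aggregate, b, seed=0):
    """Fraction of bootstrap meta-sequences placing each variable at each rank.

    cohorts      -- list of cohorts, each a list of participant records
    fit_sequence -- hypothetical callable: participants -> base sequence
    aggregate    -- hypothetical callable: base sequences -> meta-sequence
    """
    rng = random.Random(seed)
    counts = {}
    for _ in range(b):
        bases = []
        for participants in cohorts:
            # Resample with replacement, keeping the original cohort size.
            resample = [rng.choice(participants) for _ in participants]
            bases.append(fit_sequence(resample))
        meta = aggregate(bases)
        for rank, variable in enumerate(meta, start=1):
            counts.setdefault(variable, Counter())[rank] += 1
    return {v: {r: n / b for r, n in c.items()} for v, c in counts.items()}

# Trivial stand-ins (assumptions) to show the expected call pattern.
toy_cohorts = [[1, 2, 3], [4, 5, 6, 7]]
positions = bootstrap_positions(
    toy_cohorts,
    fit_sequence=lambda sample: ["ABETA", "TAU"],
    aggregate=lambda bases: bases[0],
    b=20,
)
```

Each inner dictionary maps a rank to the share of bootstrap runs in which the variable occupied that rank, which corresponds to one row of a positional variance diagram.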

For this work, we generated a meta-sequence considering only variables which were present in at least three cohorts (Table 2) and set k equal to eight. In our bootstrapped version, we drew 500 bootstrap samples. The distance metric chosen was Spearman’s footrule distance which takes the absolute difference in positions of variables into account.

Patient staging according to the determined meta-sequence

Once a meta-sequence was determined, one possible way to evaluate its plausibility across cohorts was to evaluate the assignment of subjects of the respective cohorts to the disease stages defined by the meta-sequence. In this process, each participant of a study was assigned to a disease stage which represents the current step in the meta-sequence at which the participant most likely resides. Therefore, stage 0 refers to the absence of any abnormal markers, while the farthest progressed stage m (with m being equal to the length of the sequence) implies that all events occurred for that particular subject. The corresponding equation underlying the patient staging is provided in the Supplemental Material.
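A minimal sketch of such a staging step is given below. It assumes the commonly used EBM staging rule, in which a participant is assigned to the stage with the highest likelihood under the sequence; the exact equation used in this work is given in the Supplemental Material and may differ. The function name and the probability values are illustrative assumptions.

```python
def assign_stage(p_event, p_no_event, sequence):
    """Most likely stage of one participant along an event sequence.

    p_event[v]    -- Pr(x_v | E_v) for this participant and variable v
    p_no_event[v] -- Pr(x_v | not E_v)
    sequence      -- variables in meta-sequence order
    Returns m in 0..len(sequence); stage 0 means no event has occurred.
    """
    best_stage, best_likelihood = 0, -1.0
    for m in range(len(sequence) + 1):
        likelihood = 1.0
        for pos, v in enumerate(sequence):
            # The first m events are treated as occurred, the rest as not.
            likelihood *= p_event[v] if pos < m else p_no_event[v]
        if likelihood > best_likelihood:
            best_stage, best_likelihood = m, likelihood
    return best_stage

# Assumed probabilities: abnormal ABETA and TAU, normal LDEL.
sequence = ["ABETA", "TAU", "LDEL"]
p_event = {"ABETA": 0.9, "TAU": 0.8, "LDEL": 0.1}
p_no_event = {"ABETA": 0.1, "TAU": 0.2, "LDEL": 0.9}
stage = assign_stage(p_event, p_no_event, sequence)
```

With these values the participant is placed at stage 2, that is, after the ABETA and TAU events but before the LDEL event.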

Here, we staged only participants from cohorts that contained measurements of all investigated modalities (i.e., ADNI, JADNI, EMIF, and NACC) and were bound to consider only those variables of the meta-sequence that were found in the respectively staged cohort.


Comparing event sequences derived from multiple cohort studies

We observed broad consistency with respect to the position of events across all cohorts’ sequences which resulted in an average KTC of 0.69 ± 0.28 (pairwise KTCs are presented in Table S4; sequence similarity is also indicated visually through an approximately diagonal line of the event ranks from top-left to bottom-right in Fig. 1). In most cohorts’ sequences, CSF markers ranked highly, before cognitive impairments, which were again followed by MRI-derived brain volumes in the lower ranks.

Fig. 1

Individual event sequences estimated from the ten investigated cohorts. To facilitate the comparison of relative event positions, the y-axes follow the ADNI sequence. Common events between ADNI and the other cohorts are presented above a dashed green line. The closer the sequences are to the ADNI sequence, the more diagonal the probabilistic position (colored squares) will align from top-left to bottom-right. Lateral shifts due to additional events which were not available in ADNI have to be disregarded (as for example observed in WMHAD and EDSD). Event order 1 corresponds to the first position in the sequence. The shading of squares indicates the positional probability with darker shades corresponding to higher probabilities. The relative sizes of the squares do not encode any information. The event sequences in their original form are presented in Fig. S2

The relative order among clinical assessments measuring different cognitive domains (e.g., memory, language, visuospatial, executive) was consistent across most cohorts (see Table S2 for a mapping of tests to cognitive domains). The cognitive impairment in all investigated cohorts started with memory dysfunction detected by logical memory tests (e.g., LDEL and LIMM) and proceeded with language impairments exposed by tests such as the BNT and CATFLU. Thereafter, in most cohorts, visual dysfunction identified through the CLKS or FIGC followed, and finally, executive dysfunction recognized by, for example, the DIGIT and WAIS, occurred.

Among the cohorts where CSF biomarkers had been measured (ADNI, JADNI, EMIF, NACC), the relative positions of these biomarkers, in particular of tau (TAU) and phosphorylated tau (PTAU), varied. ABETA consistently placed first in all of these cohorts’ sequences, and TAU and PTAU were mainly found in early positions as well (ADNI, JADNI, and EMIF), with the exception of NACC where they placed in the middle of the sequence. However, in all cases except JADNI, PTAU and TAU were direct neighbors, indicating the consistent, direct link between them.

The relative order of the MRI-derived brain volume events was consistent across cohorts, albeit with some variance (average KTC of 0.64 ± 0.29 for MRI variables only). While the volume changes in ADNI, JADNI, ARWIBO, and WMHAD started with ventricular expansion and were then followed by atrophy of the temporal lobe (here, hippocampus, entorhinal, middle temporal, and fusiform gyrus), in other cohorts (ANM, OASIS, NACC, EDSD), atrophy of the temporal lobe regions was the first detected change of the MRI modality. The position taken by each respective brain region again varied among the cohorts. However, in many cases, the probabilistic nature of the EBMs indicated that the order of MRI events could be interchangeable among themselves (average BC of 0.17 ± 0.13 for MRI variables only) and events occurred most probably in close temporal proximity or even simultaneously (Fig. S2), as far as the model could discern from the data.

The position of FDG-PET, another well-established imaging biomarker measuring brain hypometabolism, was consistent in both cohorts it was measured in (ADNI, JADNI). It preceded the MRI marker changes and occurred concurrently with clinical symptoms, being placed after logical memory tests such as the LIMM and LDEL. However, the positioning of FDG-PET relative to assessments of executive function differed between the two cohorts.

A multimodal meta-sequence of AD progression

To aggregate and investigate the complementary information from the base sequences in each cohort, we combined them into a single meta-sequence. Here, the position of a variable was determined based on its relative positions in all cohort sequences. Both versions of our algorithm (i.e., ML sequence-based and bootstrapping) were applied.

In the meta-sequence generated based on each cohort’s ML sequence (Fig. 2), ABETA was ranked first, followed by PTAU and TAU. The latter were again closely linked and seemingly interchangeable given their ambiguous positioning across the base sequences. In positions four and five, LDEL and LIMM followed respectively, two clinical assessments measuring memory impairment. Next, the volume of CSF in the brain was positioned in the meta-sequence. The later event ranks were covered by MRI markers of brain volume, starting with the temporal lobe (e.g., hippocampus and entorhinal cortex) and ending with the ventricles. The previously described ambiguity in the order of MRI regions is not reflected in the ML-based meta-sequence because the algorithm considers only the ranks, and not the uncertainty estimated by the individual EBMs. However, it seems sensible to consider MRI events as fairly interchangeable in the meta-model. FIGC, an assessment of visual function, positioned before FUSIF and MIDTEMP near the end of the sequence, yet its position with respect to those two variables remained rather indefinite across the base sequences in which it was assessed (ARWIBO, AIBL, EDSD).

Fig. 2

All ML base sequences from the ten investigated cohorts and the resulting meta-sequence. Due to only partially overlapping lists, the determining factor for an event’s position in the meta-sequence was not its absolute position in each base sequence (i.e., rank 1, 2, …, 11), but its relative position to other biomarkers in the same sequence (e.g., ABETA commonly places before MMSE when they were assessed together; thus, it appears before MMSE in the meta-sequence)

The consensus meta-sequence generated using the bootstrapping approach resembled the ML meta-sequence closely (KTC between both meta-sequences: 0.79; Fig. 3). Again, CSF markers placed first in the meta-sequence, were followed by cognitive assessments, and MRI events started with the temporal lobe and further progressed with the ventricles. The main difference to the ML-based meta-sequence, as well as the major region of model uncertainty, was again found among the MRI variables. This further underlined the impression that the MRI events were fairly interchangeable and probably occurred in close temporal proximity. The highest ambiguity was in the positioning of FIGC which showed a slight tendency towards the last ranks. The average KTC across all bootstrapped meta-sequences was 0.5 ± 0.20, with the highest discordance found among the MRI modality.

Fig. 3

Bootstrapped meta-sequence generated from 500 samples of the base sequences of the 10 cohorts. Event order 1 corresponds to the first position in the sequence. The shading of squares indicates the positional probability with darker shades corresponding to higher probabilities

Staging the patients of cohorts with available CSF, MRI, and cognitive scores (i.e., ADNI, JADNI, NACC, EMIF) revealed a consistent pattern across them (Fig. 4). For all cohorts, the vast majority of CU subjects were assigned to the first stage which corresponds to no event occurrences. As expected, MCI patients were largely staged between CU subjects and AD patients with some overlap in both directions. This suggests that these subjects experienced CSF marker abnormalities and some cognitive symptoms. Finally, the majority of AD patients were assigned to the last stages, indicating their abnormality along CSF markers, cognitive performance, and brain region atrophy.

Fig. 4

Number of subjects from each diagnostic group per meta-sequence stage. Each step along the x-axis corresponds to the occurrence of a new biomarker abnormality event. Stage 0 corresponds to no event occurrence while the last stage implies abnormality of all variables. Events are ordered according to the bootstrapped meta-sequence, always considering only variables in common between the measurements available in the respective cohort and the meta-sequence


In this work, we used EBMs to investigate AD progression across ten independent cohort studies by evaluating the concurrence of their individually derived event sequences. Furthermore, we proposed an algorithm to combine event sequences estimated from partially overlapping, and thus complementary, sets of variables into a single meta-sequence describing AD progression more comprehensively. Finally, we applied said algorithm on the ten event sequences to estimate a meta-sequence comprising 13 AD variables spanning CSF biomarkers, MRI measures, and clinical assessments of cognitive and functional performance.

Consistent trends across cohorts’ event sequences

The derived event sequences proved to be broadly consistent across cohorts, with the most notable variability in the ordering of MRI brain volume events. This could be caused by (1) distinct statistical biases of the cohorts for example introduced through specific recruitment criteria [21], (2) distinct prevalence of AD disease progression subtypes that follow different disease mechanisms [38,39,40], or (3) mixed neuropathologies.

Inclusion and exclusion criteria of a study shape the demographic composition of its cohort and thus can directly affect the data-driven disease progression patterns (Table S3). For instance, ADNI held a higher proportion of APOE4 carriers compared to JADNI. Given that it has been repeatedly reported that early TAU deposition is more prominent in APOE4 carriers [41,42,43], this difference might explain the earlier positioning of TAU in ADNI’s sequence as opposed to its relatively lower rank in JADNI’s.

Previously, for example, two empirically determined AD progression subtypes called “hippocampal-sparing” and “limbic-predominant” were described and associated with distinct patterns of brain atrophy [38, 44]. While structural changes in the brain start with atrophy in the medial temporal lobe (e.g., entorhinal and hippocampus) for the limbic-predominant subtype, the brain deterioration in the hippocampal-sparing subtype begins with atrophy of the frontal cortex and with the enlargement of ventricles [44]. Given their respective event sequences, this could indicate that OASIS, ADNI, and NACC might have included more patients expressing the limbic-predominant subtype, while the hippocampal-sparing subtype was more dominant among patients from ARWIBO and JADNI.

We observed that CSF biomarkers placed first in all cohorts which measured them. This finding is in concordance with previous biomarker studies that observed the occurrence of both ABETA accumulation and brain atrophy before global cognitive decline [45,46,47,48].

Autopsies of AD patients have shown that AD pathology hardly appears in isolation and that patients often suffer from a mixture of brain pathologies [49]. While most studies aim to exclude patients affected by other cognitive diseases, an AD clinical diagnosis is still mainly symptom driven and misclassification errors are possible.

Meta-sequence combines heterogeneous event sequences from multiple cohorts

A particular strength of our meta-sequence algorithm is that it works agnostic towards the differences in variable value representations exhibited across cohorts. A direct comparison of the provided data values often remains challenging without introducing statistical biases since studies differ, for example, in their data collection procedures, employed imaging machinery, and used assays. Using our approach, such semantically equivalent but statistically heterogeneous information can be combined as all computations are performed solely on the base sequences and thus potential across-cohort-biases due to value representations are avoided.

The biggest advantage of the bootstrapping approach compared to the ML sequence-based one is that it allows for uncertainty quantification. However, bootstrapped EBM sequences tend to display a substantially higher positional variance (i.e., “fuzziness”) than ML-derived ones (for an example, see Firth et al. Figures 1 and 2 [35]). Comparing our ML-based meta-sequence to the bootstrapping-based meta-sequence revealed high similarity between them. Observed differences seemed to be within variational limits expressed in the bootstrapped meta-sequence and mainly affected MRI variables.

Generated meta-sequence resembles AD pathology

One possibility to validate the derived meta-sequence was to evaluate its concordance with previous findings describing the temporal relationship between smaller subgroups of variables.

The ordering of CSF biomarkers discovered in previous EBM studies supported our observations in the meta-sequence (ABETA followed by PTAU and TAU) [15]. Our findings were also in line with a recent study [50] which demonstrated that TAU and PTAU become abnormal after ABETA and that their abnormality occurred in close temporal relationship with cognitive decline. The latter was also in concordance with our findings; however, the cognitive assessments we investigated (i.e., LDEL and LIMM) were not directly included in the referenced study. Furthermore, there is a well-established association between cognitive decline and ABETA abnormality and abundant evidence that changes in cognition typically occur after abnormalities related to CSF biomarkers [45, 50, 51].

Our observation that memory function showed abnormality before brain volumes agrees with previous studies which suggested that individual-level brain atrophy rates (not assessed in our study) precede cognitive events; however, MRI-derived brain volumes become abnormal afterwards [15].

In our meta-sequences, changes in MRI biomarkers were ranked after cognitive decline. In agreement with this, for example, Hadjichrysanthou et al. reported that changes in MRI markers appear in close succession with memory decline [52]. Also, the positioning of MRI variables with respect to CSF markers was concordant with previous observations where significant correlations between CSF biomarkers and temporal lobe atrophy were found [53,54,55]. These studies argue that increases of TAU and PTAU are attributable to the deposition of neurofibrillary tangles in the temporal lobe, including the hippocampus and entorhinal cortex, which we found to be the first brain region volumes turning abnormal. Furthermore, elevated CSF biomarkers predicted future brain atrophy in these regions (i.e., CSF biomarkers became abnormal before brain volumes).

In concordance with the relative positioning of MRI biomarkers in the meta-sequence, various studies have shown that volumetric changes start with the temporal lobe areas, including the hippocampus which preceded the abnormality of the entorhinal cortex, fusiform, and middle temporal, and further proceed to other brain regions such as the ventricles [56,57,58,59].

Finally, in agreement with previous studies [60,61,62,63] in which visual memory dysfunction was identified as one of the last stages in AD progression, the FIGC test was ranked near the end of the sequences. The fact that it was positioned after the enlargement of ventricles is in agreement with experimental evidence that changes in the ventricles may precede a deficit in visual memory function [64, 65]. Another EBM study [35] also suggested that visual processing becomes impaired after episodic memory in typical AD.

The conducted patient staging provided further evidence that the generated meta-sequence described a sensible cascade of AD progression: participants from the three diagnostic groups were distributed according to their disease severity with CU subjects being staged first, MCI patients spreading around the intermediate stages, and AD cases occupying the later stages of the sequence. Observing MCI subjects at stage 0 could be explained by CSF biomarker values and cognitive scores that were close to the probabilistic event threshold but did not yet exceed it and, consequently, the model considered them to be normal. The few AD cases that were staged early in the sequence were amyloid-negative subjects which potentially indicated their misclassification.


To build a robust meta-sequence, each variable had to be present in at least some of the base sequences to allow for meaningful distance calculations. Furthermore, the high amounts of missing data occurring when multiple data modalities are combined led to a substantial decrease of the number of available participants per study. This could have led to more noise in the EBM’s reference distributions. Additionally, modeling signals from heterogeneous data sources, such as AD cohort data, as some form of average bears the potential risk that the resulting average will resemble a rather artificial construct that cannot be observed in its specific form in the real world. However, the similarity among the base sequences as well as between base sequences and the final meta-sequence was quite high and our identified meta-sequences were highly concordant with results from both data-driven and experimental studies. Furthermore, the patient staging along the meta-sequence displayed a sensible distribution of CU, MCI, and AD subjects along the disease stages. Consequently, it is improbable that the presented meta-sequence represents such an artificial average. Finally, we want to highlight again that AD was considered primarily from a clinical perspective in all of our investigated cohort studies. As such, there is a chance that misdiagnosed patients were present in the cohorts and therefore included in this analysis as well.


In light of the reproducibility crisis, it is especially important to look beyond single data resources, validate results across multiple cohort studies, and continually develop and evaluate data-driven methods. To this end, we demonstrated general consistency across data-driven event sequences derived from ten independent cohorts using EBMs, with only relatively minor differences in the ranking of the core features that were available in all ten cohorts. In addition, our novel algorithm estimated a meta-sequence that exploits the additional information available in variables unique to individual studies and could thus assemble an event sequence that is highly multimodal and more comprehensive than sequences built from single datasets. Such aggregation is important for ensuring the transferability of models and results across AD (sub)populations and for improving our understanding of disease progression.
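As an illustration of how the consistency between two event sequences can be quantified, the following sketch computes Kendall’s tau over the variables that two sequences share. It assumes sequences without ties and without repeated variables; the function name `kendall_tau_shared` and the example sequences are hypothetical and do not reproduce the study’s data or code.

```python
from itertools import combinations


def kendall_tau_shared(seq_a, seq_b):
    """Kendall's tau between two event sequences, restricted to the
    variables that occur in both.

    Returns a value in [-1, 1]: 1 for identical ordering, -1 for fully
    reversed ordering of the shared variables.
    """
    in_b = set(seq_b)
    shared = [v for v in seq_a if v in in_b]  # kept in seq_a order
    if len(shared) < 2:
        raise ValueError("need at least two shared variables")

    pos_b = {v: i for i, v in enumerate(seq_b)}
    concordant = discordant = 0
    for x, y in combinations(shared, 2):
        # x precedes y in seq_a; check whether seq_b agrees.
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


cohort_1 = ["abeta", "ptau", "hippocampus", "mmse"]
cohort_2 = ["abeta", "hippocampus", "ptau", "fdg"]
tau = kendall_tau_shared(cohort_1, cohort_2)  # 2 concordant, 1 discordant
```

Averaging this coefficient over all cohort pairs gives a single summary of cross-cohort agreement; restricting the comparison to shared variables is one possible design choice when cohorts measure different variable sets.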

Availability of data and materials

De-identified data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI), the Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing (AIBL) database, the European Collaboration for the Discovery of Novel Biomarkers for Alzheimer’s Disease (AddNeuroMed; Synapse:syn4988768), the Alzheimer’s Disease Repository Without Borders (ARWIBO), the Open Access Series of Imaging Studies (OASIS), White Matter Hyperintensities in Alzheimer’s Disease (WMH-AD), the European Diffusion Tensor Imaging Study in Dementia (EDSD), the National Alzheimer’s Coordinating Center (NACC), the Japanese Alzheimer’s Disease Neuroimaging Initiative (JADNI), and the European Medical Information Framework for Alzheimer’s Disease Multimodal Biomarker Discovery (EMIF-AD MBD). The authors had no special access privileges others would not have to the data obtained from these resources.



Abbreviations

The Alzheimer’s Disease Neuroimaging Initiative


Japanese Alzheimer’s Disease Neuroimaging Initiative


The Australian Imaging, Biomarker and Lifestyle Flagship Study of Ageing


The National Alzheimer’s Coordinating Center




European Medical Information Framework


European DTI Study on Dementia


Alzheimer’s Disease Repository Without Borders


Open Access Series of Imaging Studies


White Matter Hyperintensities in Alzheimer’s Disease


Clinical Dementia Rating Sum of Boxes


Neuropsychiatric Inventory


Logical Memory - Delayed Recall Total Number of Story Units Recalled


Alzheimer’s Disease Assessment Scale (13-items)


Alzheimer’s Disease Assessment Scale (11-items)


Mini-Mental State Examination


Logical Memory - Immediate Recall Total Number of Story Units Recalled


Trail Making Test-B


Digit-Symbol Coding Test


California Verbal Learning Test Delayed Raw Score


Category Fluency (animals - fruits/vegetables)


Figure Copy


California Verbal Learning Test Recall Raw Score


Figure recall


C/D Stroop Test Raw


Short-Term Memory




Perceptual Orientation


Mental Manipulation




Clock Drawing Test Total Score


Executive Memory


Word List Learning Trial


Boston Naming Test Score


Digit Symbol Substitution Test




Total Tau


Phosphorylated Tau (p-Tau)


Entorhinal volume


Hippocampal volume


Fusiform volume


Ventricles volume


Middle temporal volume


Accumulated CSF in the brain


Fluorodeoxyglucose positron emission tomography (FDG PET)


Magnetic resonance imaging


Mild cognitive impairment


Alzheimer’s disease


Cognitively unimpaired


Kendall’s tau rank correlations


Event-based model


Cerebrospinal fluid


  1. Jack CR Jr, Bennett DA, Blennow K, Carrillo MC, Dunn B, Haeberlein SB, et al. NIA-AA research framework: toward a biological definition of Alzheimer’s disease. Alzheimers Dement. 2018;14(4):535–62.
  2. DeTure MA, Dickson DW. The neuropathological diagnosis of Alzheimer’s disease. Mol Neurodegen. 2019;14(1):32.
  3. Blennow K, Zetterberg H. Biomarkers for Alzheimer’s disease: current status and prospects for the future. J Intern Med. 2018;284(6):643–63.
  4. Lovestone S, Francis P, Kloszewska I, Mecocci P, Simmons A, Soininen H, et al. AddNeuroMed--the European collaboration for the discovery of novel biomarkers for Alzheimer’s disease. Ann N Y Acad Sci. 2009;1180:36–46.
  5. Lorenzi M, Filippone M, Frisoni GB, Alexander DC, Ourselin S, Alzheimer’s Disease Neuroimaging Initiative. Probabilistic disease progression modeling to characterize diagnostic uncertainty: application to staging and prediction in Alzheimer’s disease. NeuroImage. 2019;190:56–68.
  6. Jedynak BM, Lang A, Liu B, Katz E, Zhang Y, Wyman BT, et al. A computational neurodegenerative disease progression score: method and results with the Alzheimer’s disease Neuroimaging Initiative cohort. NeuroImage. 2012;63(3):1478–86.
  7. Yang E, Farnum M, Lobanov V, Schultz T, Verbeeck R, Raghavan N, et al., Alzheimer’s Disease Neuroimaging Initiative. Quantifying the pathophysiological timeline of Alzheimer’s disease. J Alzheimers Dis. 2011;26(4):745–53.
  8. Delor I, Charoin JE, Gieschke R, Retout S, Jacqmin P. Modeling Alzheimer’s disease progression using disease onset time and disease trajectory concepts applied to CDR-SOB scores from ADNI. CPT Pharmacometrics Syst Pharmacol. 2013;2(10):e78.
  9. Villemagne VL, Burnham S, Bourgeat P, Brown B, Ellis KA, Salvado O, et al. Amyloid β deposition, neurodegeneration, and cognitive decline in sporadic Alzheimer’s disease: a prospective cohort study. Lancet Neurol. 2013;12(4):357–67.
  10. Donohue MC, Jacqmin-Gadda H, Le Goff M, Thomas RG, Raman R, Gamst A, et al. Estimating long-term multivariate progression from short-term data. Alzheimers Dement. 2014;10(5 Suppl):S400–10.
  11. Dekker I, Schoonheim MM, Venkatraghavan V, Eijlers A, Brouwer I, Bron EE, et al. The sequence of structural, functional and cognitive changes in multiple sclerosis. NeuroImage Clin. 2021;29:102550.
  12. Oxtoby NP, Leyland LA, Aksman LM, Thomas G, Bunting EL, Wijeratne P, et al. Sequence of clinical and neurodegeneration events in Parkinson’s disease progression. Brain. 2021;144(3):975–88.
  13. Fonteijn HM, Modat M, Clarkson MJ, Barnes J, Lehmann M, Hobbs NZ, et al. An event-based model for disease progression and its application in familial Alzheimer’s disease and Huntington’s disease. NeuroImage. 2012;60(3):1880–9.
  14. Wijeratne PA, Young AL, Oxtoby NP, Marinescu RV, Firth NC, Johnson E, et al. An image-based model of brain volume biomarker changes in Huntington’s disease. Ann Clin Transl Neurol. 2018;5(5):570–82.
  15. Young AL, Oxtoby NP, Daga P, Cash DM, Fox NC, Ourselin S, et al. A data-driven model of biomarker changes in sporadic Alzheimer’s disease. Brain. 2014;137(Pt 9):2564–77.
  16. Young AL, Marinescu RV, Oxtoby NP, Bocchetta M, Yong K, Firth NC, et al. Uncovering the heterogeneity and temporal complexity of neurodegenerative diseases with subtype and stage inference. Nat Commun. 2018;9(1):4273.
  17. Mueller SG, Weiner MW, Thal LJ, Petersen RC, Jack CR, Jagust W, et al. Ways toward an early diagnosis in Alzheimer’s disease: the Alzheimer’s Disease Neuroimaging Initiative (ADNI). Alzheimers Dement. 2005;1(1):55–66.
  18. Solomon A, Kivipelto M, Molinuevo JL, Tom B, Ritchie CW. European prevention of Alzheimer’s dementia longitudinal cohort study (EPAD LCS): study protocol. Prev Alzheimers Dis. 2018;8(12):e021017.
  19. Oxtoby NP, Young AL, Cash DM, Benzinger T, Fagan AM, Morris JC, et al. Data-driven models of dominantly-inherited Alzheimer’s disease progression. Brain. 2018;141(5):1529–44.
  20. Archetti D, Ingala S, Venkatraghavan V, Wottschel V, Young AL, Bellio M, et al. Multi-study validation of data-driven disease progression models to characterize evolution of biomarkers in Alzheimer’s disease. NeuroImage. 2019;24:101954.
  21. Birkenbihl C, Salimi Y, Fröhlich H, Japanese Alzheimer’s Disease Neuroimaging Initiative, Alzheimer’s Disease Neuroimaging Initiative. Unraveling the heterogeneity in Alzheimer’s disease progression across multiple cohorts and the implications for data-driven disease modeling. Alzheimers Dement. 2021.
  22. Birkenbihl C, Emon MA, Vrooman H, Westwood S, Lovestone S, et al. Differences in cohort study data affect external validation of artificial intelligence models for predictive diagnostics of dementia - lessons for translation into clinical practice. EPMA. 2020;11(3):367–76.
  23. Salimi Y, Domingo-Fernandez D, Bobis-Alvarez C, Hofmann-Apitius M, Vasculature I, Birkenbihl C, et al. ADataViewer: exploring semantically harmonized Alzheimer’s disease cohort datasets. medRxiv. 2021.
  24. McKhann G, Drachman D, Folstein M, Katzman R, Price D, Stadlan EM. Clinical diagnosis of Alzheimer’s disease: report of the NINCDS-ADRDA Work Group under the auspices of Department of Health and Human Services Task Force on Alzheimer’s Disease. Neurology. 1984;34(7):939–44.
  25. Iwatsubo T. Japanese Alzheimer’s Disease Neuroimaging Initiative: present status and future. Alzheimers Dement. 2010;6(3):297–9.
  26. Ellis KA, Bush AI, Darby D, De Fazio D, Foster J, Hudson P, et al. The Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging: methodology and baseline characteristics of 1112 individuals recruited for a longitudinal study of Alzheimer’s disease. Int Psychogeriatr. 2009;21(4):672–87.
  27. Besser L, Kukull W, Knopman DS, Chui H, Galasko D, Weintraub S, et al. Version 3 of the National Alzheimer’s Coordinating Center’s Uniform Data Set. Alzheimer Dis Assoc Disord. 2018;32(4):351–8.
  28. Birkenbihl C, Westwood S, Shi L, Nevado-Holgado A, Westman E, Lovestone S, et al. ANMerge: a comprehensive and accessible Alzheimer’s disease patient-level dataset. J Alzheimers Dis. 2021;79(1):423–31.
  29. Bos I, Vos S, Vandenberghe R, Scheltens P, Engelborghs S, Frisoni G, et al. The EMIF-AD Multimodal Biomarker Discovery study: design, methods and cohort characteristics. Alzheimers Res Ther. 2018;10(1):1–9.
  30. Brueggen K, Grothe MJ, Dyrba M, Fellgiebel A, Fischer F, Filippi M, et al. The European DTI Study on Dementia—a multicenter DTI and MRI study on Alzheimer’s disease and mild cognitive impairment. NeuroImage. 2017;144:305–8.
  31. Frisoni GB, Prestia A, Zanetti O, Galluzzi S, Romano M, Cotelli M, et al. Markers of Alzheimer’s disease in a population attending a memory clinic. Alzheimers Dement. 2009;5(4):307–17.
  32. Marcus DS, Wang TH, Parker J, Csernansky JG, Morris JC, Buckner RL. Open Access Series of Imaging Studies (OASIS): cross-sectional MRI data in young, middle aged, nondemented, and demented older adults. J Cogn Neurosci. 2007;19(9):1498–507.
  33. Marcus DS, Fotenos AF, Csernansky JG, Morris JC, Buckner RL. Open access series of imaging studies: longitudinal MRI data in nondemented and demented older adults. J Cogn Neurosci. 2010;22(12):2677–84.
  34. Damulina A, Pirpamer L, Seiler S, Benke T, Dal-Bianco P, Ransmayr G, et al. White matter hyperintensities in Alzheimer’s disease: a lesion probability mapping study. J Alzheimers Dis. 2019;68(2):789–96.
  35. Firth NC, Primativo S, Brotherhood E, Young AL, Yong K, Crutch SJ, et al. Sequences of cognitive decline in typical Alzheimer’s disease and posterior cortical atrophy estimated using a novel event-based model of disease progression. Alzheimers Dement. 2020;16(7):965–73.
  36. DeConde RP, Hawley S, Falcon S, Clegg N, Knudsen B, Etzioni R. Combining results of microarray experiments: a rank aggregation approach. Stat Appl Genet Mol Biol. 2006;5:Article15.
  37. Lin S, Ding J. Integration of ranked lists via cross entropy Monte Carlo with applications to mRNA and microRNA studies. Biometrics. 2009;65(1):9–18.
  38. Ferreira D, Nordberg A, Westman E. Biological subtypes of Alzheimer disease: a systematic review and meta-analysis. Neurology. 2020;94(10):436–48.
  39. Whitwell JL, Jack CR Jr, Przybelski SA, Parisi JE, Senjem ML, Boeve BF, et al. Temporoparietal atrophy: a marker of AD pathology independent of clinical diagnosis. Neurobiol Aging. 2011;32(9):1531–41.
  40. Piaceri I, Nacmias B, Sorbi S. Genetics of familial and sporadic Alzheimer’s disease. Front Biosci. 2013;5(1):167–77.
  41. Lemprière S. APOE4 provokes tau aggregation via inhibition of noradrenaline transport. Nat Rev Neurol. 2021;17(6):328.
  42. Baek MS, Cho H, Lee HS, Lee JH, Ryu YH, Lyoo CH. Effect of APOE ε4 genotype on amyloid-β and tau accumulation in Alzheimer’s disease. Alzheimers Res Ther. 2020;12(1):1–12.
  43. Benson GS, Bauer C, Hausner L, Couturier S, Lewczuk P, Peters O, et al. Don’t forget about tau: the effects of ApoE4 genotype on Alzheimer’s disease cerebrospinal fluid biomarkers in subjects with mild cognitive impairment—data from the Dementia Competence Network. J Neural Transm. 2022:1–10.
  44. Ferreira D, Verhagen C, Hernández-Cabrera JA, Cavallin L, Guo CJ, Ekman U, et al. Distinct subtypes of Alzheimer’s disease based on patterns of brain atrophy: longitudinal trajectories and clinical applications. Sci Rep. 2017;7:46263.
  45. Iturria-Medina Y, Sotero RC, Toussaint PJ, Mateos-Pérez JM, Evans AC, et al. Early role of vascular dysregulation on late-onset Alzheimer’s disease based on multifactorial data-driven analysis. Nat Commun. 2016;7:11934.
  46. Chen G, Shu H, Chen G, Ward BD, Antuono PG, Zhang Z, et al. Staging Alzheimer’s disease risk by sequencing brain function and structure, cerebrospinal fluid, and cognition biomarkers. J Alzheimers Dis. 2016;54(3):983–93.
  47. Mormino EC, Kluth JT, Madison CM, Rabinovici GD, Baker SL, Miller B, et al. Episodic memory loss is related to hippocampal-mediated beta-amyloid deposition in elderly subjects. Brain. 2009;132(Pt 5):1310–23.
  48. Wang F, Gordon BA, Ryman DC, Ma S, Xiong C, Hassenstab J, et al. Cerebral amyloidosis associated with cognitive decline in autosomal dominant Alzheimer disease. Neurology. 2015;85(9):790–8.
  49. Schneider JA, Arvanitakis Z, Bang W, Bennett DA. Mixed brain pathologies account for most dementia cases in community-dwelling older persons. Neurology. 2007;69(24):2197–204.
  50. Luo J, Agboola F, Grant E, Masters CL, Albert MS, Johnson SC, et al. Sequence of Alzheimer disease biomarker changes in cognitively normal adults: a cross-sectional study. Neurology. 2020;95(23):e3104–16.
  51. Ellis KA, Lim YY, Harrington K, Ames D, Bush AI, Darby D, et al. Decline in cognitive function over 18 months in healthy older adults with high amyloid-β. J Alzheimers Dis. 2013;34(4):861–71.
  52. Hadjichrysanthou C, Evans S, Bajaj S, Siakallis LC, McRae-McKee K, de Wolf F, et al. The dynamics of biomarkers across the clinical spectrum of Alzheimer’s disease. Alzheimers Res Ther. 2020;12(1):1–16.
  53. Armstrong NM, An Y, Shin JJ, Williams OA, Doshi J, Erus G, et al. Associations between cognitive and brain volume changes in cognitively normal older adults. NeuroImage. 2020;223:117289.
  54. Herukka SK, Pennanen C, Soininen H, Pirttilä T. CSF Abeta42, tau and phosphorylated tau correlate with medial temporal lobe atrophy. J Alzheimers Dis. 2008;14(1):51–7.
  55. Granadillo E, Paholpak P, Mendez MF, Teng E. Visual ratings of medial temporal lobe atrophy correlate with CSF tau indices in clinical variants of early-onset Alzheimer disease. Dement Geriatr Cogn Disord. 2017;44(1-2):45–54.
  56. Bouwman FH, Schoonenboom SN, van der Flier WM, van Elk EJ, Kok A, Barkhof F, et al. CSF biomarkers and medial temporal lobe atrophy predict dementia in mild cognitive impairment. Neurobiol Aging. 2007;28(7):1070–4.
  57. Younes L, Albert M, Miller MI, BIOCARD Research Team. Inferring changepoint times of medial temporal lobe morphometric change in preclinical Alzheimer’s disease. NeuroImage Clin. 2014;5:178–87.
  58. Coupé P, Manjón JV, Lanuza E, Catheline G. Lifespan changes of the human brain in Alzheimer’s disease. Sci Rep. 2019;9(1):3998.
  59. Scahill RI, Schott JM, Stevens JM, Rossor MN, Fox NC. Mapping the evolution of regional atrophy in Alzheimer’s disease: unbiased analysis of fluid-registered serial MRI. Proc Natl Acad Sci U S A. 2002;99(7):4703–7.
  60. Braak H, Braak E. Neuropathological stageing of Alzheimer-related changes. Acta Neuropathol. 1991;82(4):239–59.
  61. Storey E, Slavin MJ, Kinsella GJ. Patterns of cognitive impairment in Alzheimer’s disease: assessment and differential diagnosis. Front Biosci. 2002;7:e155–84.
  62. Breteler MM, van Amerongen NM, van Swieten JC, Claus JJ, Grobbee DE, van Gijn J, et al. Cognitive correlates of ventricular enlargement and cerebral white matter lesions on magnetic resonance imaging. The Rotterdam Study. Stroke. 1994;25(6):1109–15.
  63. Young J, Modat M, Cardoso MJ, Mendelson A, Cash D, Ourselin S. Accurate multimodal probabilistic prediction of conversion to Alzheimer’s disease in patients with mild cognitive impairment. NeuroImage Clin. 2013;2:735–45.
  64. Ferreira D, Pereira JB, Volpe G, Westman E. Subtypes of Alzheimer’s disease display distinct network abnormalities extending beyond their pattern of brain atrophy. Front Neurol. 2019;10:524.
  65. Birkenbihl C, Salimi Y, Domingo-Fernández D, Lovestone S, AddNeuroMed consortium, Fröhlich H, et al. Evaluating the Alzheimer’s disease data landscape. Alzheimer’s Dementia: Translat Res Clin Interv. 2020;6(1):e12102.


We want to commend all data owners on their adherence to open science principles by sharing their data. We believe that their commitment is invaluable for AD research.

Data collection and sharing for this project were funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI; National Institutes of Health Grant U01 AG024904) and DOD ADNI (Department of Defense award number W81XWH-12-2-0012). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering, and through generous contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd; Janssen Alzheimer Immunotherapy Research & Development, LLC; Johnson & Johnson Pharmaceutical Research & Development LLC; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics. The Canadian Institutes of Health Research is providing funds to support ADNI clinical sites in Canada. Private-sector contributions are facilitated by the Foundation for the National Institutes of Health. The grantee organization is the Northern California Institute for Research and Education, and the study is coordinated by the Alzheimer’s Therapeutic Research Institute at the University of Southern California. ADNI data are disseminated by the Laboratory for NeuroImaging at the University of Southern California.

Data collection and sharing of ARWIBO was supported by the Italian Ministry of Health, under the following grant agreements: Ricerca Corrente IRCCS Fatebenefratelli, Linea di Ricerca 2; Progetto Finalizzato Strategico 2000-2001 “Archivio normativo italiano di morfometria cerebrale con risonanza magnetica (età 40+)”; Progetto Finalizzato Strategico 2000-2001 “Decadimento cognitivo lieve non dementigeno: stadio preclinico di malattia di Alzheimer e demenza vascolare. Caratterizzazione clinica, strumentale, genetica e neurobiologica e sviluppo di criteri diagnostici utilizzabili nella realtà nazionale”; Progetto Finalizzata 2002 “Sviluppo di indicatori di danno cerebrovascolare clinicamente significativo alla risonanza magnetica strutturale”; Progetto Fondazione CARIPLO 2005-2007 “Geni di suscettibilità per gli endofenotipi associati a malattie psichiatriche e dementigene”; “Fitness and Solidarietà”; and anonymous donors.

J-ADNI was supported by the following grants: Translational Research Promotion Project from the New Energy and Industrial Technology Development Organization of Japan; Research on Dementia, Health Labor Sciences Research Grant; Life Science Database Integration Project of Japan Science and Technology Agency; Research Association of Biotechnology (contributed by Astellas Pharma Inc., Bristol-Myers Squibb, Daiichi-Sankyo, Eisai, Eli Lilly and Company, Merck-Banyu, Mitsubishi Tanabe Pharma, Pfizer Inc., Shionogi & Co., Ltd., Sumitomo Dainippon, and Takeda Pharmaceutical Company), Japan; and a grant from an anonymous foundation.

The NACC database is funded by NIA/NIH Grant U01 AG016976. NACC data are contributed by the NIA-funded ADCs: P30 AG019610 (PI Eric Reiman, MD), P30 AG013846 (PI Neil Kowall, MD), P30 AG062428-01 (PI James Leverenz, MD), P50 AG008702 (PI Scott Small, MD), P50 AG025688 (PI Allan Levey, MD, PhD), P50 AG047266 (PI Todd Golde, MD, PhD), P30 AG010133 (PI Andrew Saykin, PsyD), P50 AG005146 (PI Marilyn Albert, PhD), P30 AG062421-01 (PI Bradley Hyman, MD, PhD), P30 AG062422-01 (PI Ronald Petersen, MD, PhD), P50 AG005138 (PI Mary Sano, PhD), P30 AG008051 (PI Thomas Wisniewski, MD), P30 AG013854 (PI Robert Vassar, PhD), P30 AG008017 (PI Jeffrey Kaye, MD), P30 AG010161 (PI David Bennett, MD), P50 AG047366 (PI Victor Henderson, MD, MS), P30 AG010129 (PI Charles DeCarli, MD), P50 AG016573 (PI Frank LaFerla, PhD), P30 AG062429-01 (PI James Brewer, MD, PhD), P50 AG023501 (PI Bruce Miller, MD), P30 AG035982 (PI Russell Swerdlow, MD), P30 AG028383 (PI Linda Van Eldik, PhD), P30 AG053760 (PI Henry Paulson, MD, PhD), P30 AG010124 (PI John Trojanowski, MD, PhD), P50 AG005133 (PI Oscar Lopez, MD), P50 AG005142 (PI Helena Chui, MD), P30 AG012300 (PI Roger Rosenberg, MD), P30 AG049638 (PI Suzanne Craft, PhD), P50 AG005136 (PI Thomas Grabowski, MD), P30 AG062715-01 (PI Sanjay Asthana, MD, FRCP), P50 AG005681 (PI John Morris, MD), P50 AG047270 (PI Stephen Strittmatter, MD, PhD).


This project has received funding from the European Union’s Horizon 2020 research and innovation program under grant agreement No. 826421, “TheVirtualBrain-Cloud,” and under grant agreement No. 666992 “EuroPOND”. NPO is a UKRI Future Leaders Fellow (MR/S03546X/1). Open Access funding enabled and organized by Projekt DEAL.

Author information


Corresponding author

Correspondence to Sepehr Golriz Khatami.

Ethics declarations

Ethics approval and consent to participate

Participants of every cohort dataset that was used in this work gave informed written consent for data collection and sharing. For more details, we refer to the provided references of each cohort, respectively.

Consent for publication

The authors submitted the manuscript to all data owners who require manuscript approval prior to publication and acquired consent.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Alzheimer’s Disease Neuroimaging Initiative: Data used in the preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database. As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ADNI investigators can be found at

Japanese Alzheimer’s Disease Neuroimaging Initiative: Data used in the preparation of this article were obtained from the Japanese Alzheimer’s Disease Neuroimaging Initiative (J-ADNI) database deposited in the National Bioscience Database Center Human Database, Japan (Research ID: hum0043.v1, 2016). As such, the investigators within J-ADNI contributed to the design and implementation of J-ADNI and/or provided data but did not participate in the analysis or writing of this report. A complete listing of J-ADNI investigators can be found at:

Data used in the preparation of this article were obtained from the Alzheimer’s Disease Repository Without Borders (ARWiBo) database. As such, the researchers within the ARWiBo contributed to the design and implementation of ARWiBo and/or provided data but did not participate in the analysis or writing of this report. A complete listing of ARWiBo researchers can be found at:


Rights and permissions

This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Golriz Khatami, S., Salimi, Y., Hofmann-Apitius, M. et al. Comparison and aggregation of event sequences across ten cohorts to describe the consensus biomarker evolution in Alzheimer’s disease. Alz Res Therapy 14, 55 (2022).
