Validation study of “Santé-Cerveau”, a digital tool for early cognitive changes identification
Alzheimer's Research & Therapy volume 15, Article number: 70 (2023)
There is a need for a reliable, easy-to-use, widely available, and validated tool for timely cognitive impairment identification. We created a computerized cognitive screening tool (Santé-Cerveau digital tool (SCD-T)) including validated questionnaires and the following neuropsychological tests: 5 Word Test (5-WT) for episodic memory, Trail Making Test (TMT) for executive functions, and a number coding test (NCT) adapted from the Digit Symbol Substitution Test for global intellectual efficiency. This study aimed to evaluate the performance of SCD-T to identify cognitive deficit and to determine its usability.
Three groups were constituted including 65 elderly Controls, 64 patients with neurodegenerative diseases (NDG): 50 AD and 14 non-AD, and 20 post-COVID-19 patients. The minimum MMSE score for inclusion was 20. Association between computerized SCD-T cognitive tests and their standard equivalent was assessed using Pearson's correlation coefficients. Two algorithms (a simple clinician-guided algorithm involving the 5-WT and the NCT; and a machine learning classifier based on 8 scores from the SCD-T tests extracted from a multiple logistic regression model, and data from the SCD-T questionnaires) were evaluated. The acceptability of SCD-T was investigated through a questionnaire and scale.
AD and non-AD participants were older (mean ± standard deviation (SD): 72.61 ± 6.79 vs 69.91 ± 4.86 years old, p = 0.011) and had a lower MMSE score (Mean difference estimate ± standard error: 1.74 ± 0.14, p < 0.001) than Controls; post-COVID-19 patients were younger than Controls (mean ± SD: 45.07 ± 11.36 years old, p < 0.001). All the computerized SCD-T cognitive tests were significantly associated with their reference version. In the pooled Controls and NDG group, the correlation coefficient was 0.84 for verbal memory, -0.60 for executive functions, and 0.72 for global intellectual efficiency. The clinician-guided algorithm demonstrated 94.4% ± 3.8% sensitivity and 80.5% ± 8.7% specificity, and the machine learning classifier 96.8% ± 3.9% sensitivity and 90.7% ± 5.8% specificity. The acceptability of SCD-T was good to excellent.
We demonstrate the high accuracy of SCD-T in screening cognitive disorders and its good acceptance even in individuals with prodromal and mild dementia stages. SCD-T would be useful in primary care to refer subjects with significant cognitive impairment to specialized consultation more quickly (and limit unnecessary referrals), and to improve the AD care pathway and pre-screening in clinical trials.
Alzheimer's disease (AD) affects nearly one million people in France and represents a major public health issue. Fewer than half of the patients are diagnosed in France, with a mean Mini Mental State Examination (MMSE) score of 17 at the time of diagnosis, i.e. at a moderately advanced stage of disease. Early diagnosis of cognitive and memory impairment is an important health concern due to the impact on patients, caregivers, and healthcare systems [2, 3].
Early detection is essential to set up a care plan, prevent risky behaviors, and anticipate complications arising from neurodegenerative diseases. Moreover, from a therapeutic point of view, preventive measures are already applicable in the life course, and AD disease-modifying molecules are being developed for the earliest stages of the disease (MMSE > 20). For instance, the current anti-amyloid immunotherapies phase 3 pre-registration randomized-controlled clinical trials closest to approval involve only individuals with prodromal AD and mild AD dementia. If these drugs were to be approved in the upcoming years, the need for wider access for the population to an early-stage biomarker-proven AD diagnosis would be huge.
However, the need for an early-stage diagnosis of AD comes up against several difficulties concerning the mobilization of patients, their families, and general practitioners who often need to be convinced of its value. General practitioners do not always have the time, training, or tools to do this [7, 8], notably because it may be difficult to distinguish between subjective memory complaints and objective memory deficits in the absence of formal memory testing. Cognitive complaints are widespread in the elderly population; however, cognitive disorders are under-diagnosed, primarily due to the lack of cognitive assessment in primary care or to the use of scales inaccurate at detecting early-stage dementia. To rebalance these two observations, it would be useful for any person with a cognitive complaint to have access to an objective and reliable evaluation in primary care. Brief computerized cognitive testing may be an option, and many tools are available today. However, most of them still need to be validated in large, controlled study settings, to allow their widespread use [11, 12].
We have developed, in partnership with MindMaze France, the "Santé-Cerveau" digital tool (SCD-T), including questionnaires and three cognitive tests adapted from validated paper/pencil versions: the 5 Word Test (5-WT), the Trail Making Test (TMT), and the Number Coding Test (NCT) adapted from the Digit Symbol Substitution Test (DSST), selected for their ability to assess significant cognitive functions e.g., episodic memory, executive functions and general intellectual efficiency, respectively. Their dysfunction is a proven early marker of AD and related disorders.
The objectives of the study were to evaluate 1) the SCD-T concordance with standard neuropsychological testing; 2) its performance to identify significant cognitive impairment in three different groups of individuals: controls, patients with neurodegenerative diseases, and subjects with a cognitive complaint after SARS-CoV-2 infection; and 3) its acceptability by users. Regarding the third group, many subjects reported cognitive complaint after COVID-19 recovery, which was shown to be associated with affective symptoms (anxiety, depression, fatigue) or with impairment in a wide range of cognitive domains (executive functions, speed of processing, attention, memory, and processing abilities). The cognitive symptoms do not seem to be part of a neurodegenerative process, but could be related to functional, grey and white matter changes following axonal damage, inflammation or reduced perfusion. Therefore, the pandemic occurrence provided an opportunity for us to evaluate the performance of our digital tool for discriminating memory complaints related to affective or attentional/executive disorders from memory complaints related to a true amnestic syndrome of AD.
To study the capacity of SCD-T to identify significant cognitive impairment and to discriminate changes associated with Alzheimer disease, we included in the study: i) subjects with cognitive deficits previously established by a comprehensive neuropsychological battery (CNB) and with a defined diagnosis based on our clinical work-up, i.e., patients with Alzheimer disease and non-Alzheimer neurodegenerative diseases at early clinical stages; and ii) control subjects with normal cognitive functioning, with or without memory complaint. In addition, as mentioned above, the occurrence of the COVID-19 pandemic was an opportunity to include a third group of subjects with SARS-CoV-2 infection. All the participants (n = 149) in the SCD-T validation study were consecutively recruited in the context of clinical routine at the IM2A between February 2020 and April 2021.
To be included in the study, all participants had to be between 60 and 85 years of age (except for post-COVID-19 patients, who had no age limits), be registered with the French National Health Insurance system, have signed the written consent form, be a native French speaker with > 7 years of education, and have an MMSE score ≥ 20 points. Patients with a known neurological condition other than AD or related diseases, history of neoplasia or cerebral radiotherapy, developmental disorders or severe psychiatric illness (including severe depressive syndromes), history of head trauma or stroke with sequelae were excluded. We also excluded the subjects with addiction (alcohol or drugs), visual or auditory sensory deficits that could prevent the performance of cognitive tests, and subjects taking medication at doses known to interfere with memory and concentration.
All the participants of this study were evaluated with a comprehensive neuropsychological battery (CNB). In case of an impaired cognitive performance, the clinical routine diagnostic work-up was completed with a psychological interview, an 18F-FDG PET-MRI, and a lumbar puncture for cerebrospinal fluid core AD biomarkers investigation (Aβ1-42 and total and phosphorylated tau proteins). The lumbar punctures were performed and analyzed based on a method described elsewhere. The diagnosis was established after an interdisciplinary discussion according to international criteria [15,16,17,18,19,20].
Patients with neurodegenerative disease (NDG) (n = 64)
The group included 50 patients with AD according to the International Working Group 2021 criteria and confirmed by positive CSF biomarkers (25 patients at a prodromal stage, 25 patients at a mild dementia stage (MMSE between 20 and 26)), and 14 patients with a related degenerative disease with normal CSF biomarkers, including 6 patients with Lewy body dementia, 3 patients with primary progressive aphasia, 2 patients with frontotemporal dementia, 2 patients with non-AD amnestic syndrome and 1 patient with cortico-basal degeneration.
Controls (n = 65)
The group consisted of cognitively unimpaired individuals (normal performance on the CNB): 31 subjects with no memory complaints and 34 subjects with memory complaint and a CSF AD biomarkers investigation (32 within the normal range and 2 with abnormal levels, corresponding to individuals ‘asymptomatic at risk’ for AD, or preclinical AD).
Post-COVID-19 patients (n = 20)
During the COVID-19 pandemic, we included 20 individuals with persistent cognitive complaints 3 to 6 months after SARS-CoV-2 infection, classified as post-COVID-19 condition. We enrolled all the individuals with a SARS-CoV-2 infection confirmed by an RT-PCR test referred to IM2A for cognitive testing. The inclusion period was between November 2020 and April 2021.
"Santé-Cerveau" digital tool (SCD-T)
SCD-T is a CE-marked, class I digital medical device. In this validation study, participants were presented with the prototype version of this device developed with MindMaze France. This tool is accessible from a web platform and is performed on a touch tablet (Android operating system; Samsung® Galaxy Tab S5e®; 10.5-inch screen) equipped with a standard-size headset (brand name) with an integrated microphone. Data recorded on SCD-T (test results, questionnaire responses) were transferred to the Curapy platform (www.Curapy.com) and available to the physician in the form of individual, automatically generated reports.
This platform, developed by MindMaze France, allows for data security (encryption, logging, secure operation) and uses an approved health data server (AZNetwork).
SCD-T includes questionnaires and cognitive tests to assess the intensity of the memory complaint, comorbid conditions, and the detection of objective memory and cognitive impairment.
The questionnaires include the participants’ socio-demographic (age, sex, level of education) and basic medical (personal medical history, family history of neurodegenerative disorder, and current treatments) data, the intensity of the memory complaint assessed with the Mac Nair 15-item scale, and the mood status measured with the 15-item Geriatric Depression Scale (GDS). Thus, the questionnaires consider factors associated with cognitive impairment, dementia and Alzheimer disease. Among them, some factors are modifiable and as such of particular interest to treat, such as hypertension, diabetes, cardiovascular diseases and depression.
The neuropsychological tests
We selected three cognitive tests for their ability to assess the main cognitive functions known to be altered at the early stage of AD and related disorders (global intellectual efficiency, episodic memory, and executive functions). These validated paper/pencil neuropsychological tests were adapted and integrated into a digital version as close as possible to their original version, in terms of presentation, time spent on the test, and content of the instructions. Each of the evaluation tests was preceded by a presentation of the instructions with a video example, followed by a short training phase.
The number coding test (NCT) adapted from the Digit Symbol Substitution Test (DSST) was proposed to test the overall intellectual efficiency through the speed of central processing and execution, visuospatial, and working memory functions. The DSST is presented as the most accurate predictor of brain dysfunction among the Wechsler Adult Intelligence Scale (WAIS) subtests and is therefore considered a good tool to identify cognitive deficits in the older adult population. Moreover, it has been one of the tests used to detect early cognitive changes associated with the progression from preclinical to prodromal stage of AD; it is also sensitive to cognitive decline in prodromal and mild dementia. In the digital version, a series of numbers was presented on the screen; the individual was instructed to associate each number with a symbol by selecting it from a list, using the code provided at the top of the screen. The number of total, good and wrong answers over a 2-min test was recorded.
The trail making test (TMT) was proposed to assess executive functions known to be impaired early in AD and in related disorders, such as frontotemporal dementias, cortico-basal and progressive supranuclear palsy syndromes. In the digital version, the participant clicked on the numbers on the screen as quickly as possible following an ascending order (TMT-A), then alternated between the series of numbers following an ascending order and the series of letters following an alphabetical order (TMT-B). We recorded the execution time (in seconds) of each of parts A and B of the test and the calculated time of part B minus part A.
The 5 Word Test (5-WT) is based on the principle of semantic cueing to identify an amnestic syndrome of the hippocampal type (ASHT) [30, 31]. ASHT was shown to be highly associated with AD pathology. In the digital version, a list of five words was presented, which the user had to read aloud and then name in response to their categories. An immediate recall of the five words, using an automated voice recognition system, controls for the correct encoding of the five words. The delayed recall of the list, with automated voice recognition, occurs after five minutes, corresponding to the NCT and TMT period. Semantic cues are used in the test phase to prompt recall of items not retrieved by free recall (cued recall). The scores include a raw total score (total words recalled: cued and free recall, in the immediate and delayed recall phases), and a weighted total score (total cued recalls + 2 × total free recalls).
SCD-T cognitive tests were performed in the following order: the 5-WT immediate recall (encoding phase), the TMT parts A and B, the NCT, and the 5-WT delayed recall (retrieval phase).
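The two 5-WT scores described above can be illustrated with a minimal sketch. The `RecallPhase` container and the function name are hypothetical and are not part of SCD-T; the sketch only assumes that each phase yields a count of freely recalled and cued words out of five.

```python
from dataclasses import dataclass


@dataclass
class RecallPhase:
    """Words retrieved in one recall phase of the 5-WT (hypothetical container)."""
    free: int  # words retrieved spontaneously (free recall)
    cued: int  # words retrieved only after the semantic cue (cued recall)


def five_word_scores(immediate: RecallPhase, delayed: RecallPhase) -> tuple[int, int]:
    """Return (raw_total, weighted_total) over the immediate and delayed phases."""
    for phase in (immediate, delayed):
        if phase.free < 0 or phase.cued < 0 or phase.free + phase.cued > 5:
            raise ValueError("each phase must contain between 0 and 5 recalled words")
    total_free = immediate.free + delayed.free
    total_cued = immediate.cued + delayed.cued
    raw_total = total_free + total_cued           # total words recalled, maximum 10
    weighted_total = total_cued + 2 * total_free  # total cued + 2 x total free, maximum 20
    return raw_total, weighted_total


# Example: 4 free + 1 cued at immediate recall, 3 free + 1 cued at delayed recall.
print(five_word_scores(RecallPhase(free=4, cued=1), RecallPhase(free=3, cued=1)))
```

The weighting gives twice as much credit to spontaneous retrieval as to cue-assisted retrieval, so two subjects with the same raw total can differ on the weighted total.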
Cognitive test conditions
SCD-T was performed in a quiet environment, in the presence of an investigator who intervened only at the beginning and end of the session to launch and close the app. The participant filled in the different questionnaires and performed the cognitive tests proposed alone.
Standard comprehensive neuropsychological battery (CNB)
All participants underwent a reference CNB that differed between the groups of participants according to their clinical status.
CNB for patients with neurodegenerative diseases (NDG)
The tests and questionnaires were part of the standard diagnostic and follow-up procedures of the Pitié-Salpêtrière Memory Clinic (Institut de la Mémoire et de la Maladie d’Alzheimer, IM2A). The cognitive testing included: the Mini Mental State Examination (MMSE), the Digit and Visuo-spatial Spans, the 40-item semantic battery (BECS-GRECO), the Free and Cued Selective Reminding Test (FCSRT), a praxis assessment, the Frontal Assessment Battery (FAB), the TMT parts A and B, the Rey Complex Figure, and a verbal fluency assessment. The behavioral testing included: the Hospital Anxiety and Depression (HAD) scale and the Starkstein Apathy scale. The functional testing included: the Instrumental Activities of Daily Living (IADL) and the Amsterdam Instrumental Activity of Daily Living Questionnaire. SCD-T was performed within four months of the CNB.
CNB for Control group
A reduced CNB was proposed. It consisted of six cognitive tests: the MMSE, the FCSRT, the subtest code of the WAIS-IV, the Paced Auditory Serial Addition Test (PASAT), the FAB, and the Stroop Test. This cognitive testing was performed within four months of SCD-T, to avoid the retest effect and interference with memory testing.
CNB for post-COVID-19 patients
We adapted the CNB in line with the first cognitive and emotional reports from COVID-19 patients. The cognitive testing included: the MMSE, the Digit and Visuospatial Spans, the FCSRT, the Delayed Matching to Sample Task 48, the PASAT, the DSST, the FAB, the TMT parts A and B, the Stroop test, a verbal fluency assessment, and the Facial Action Coding System. The emotional and behavioral testing included: the Posttraumatic Stress disorder Checklist Scale, the Chalder fatigue scale, the HAD scale, and the French apathy Dimensional Scale. SCD-T was performed within one month of the CNB.
All the participants completed an unpublished Questionnaire on Cognitive Tests (QCT) to test the acceptability of SCD-T. The QCT included 5 questions: (1) How do you think you did on these cognitive tests compared to others of your age? (2) Do you think tests’ results represent your memory and attention? (3) How did you feel during the tests? (4) Were the instructions clear? (5) Would it have been helpful if someone had explained the tests to you and answered your questions before you took them? The participants answered these questions by choosing an answer among 5 options. They also completed the System Usability Scale (SUS) to test the usability of SCD-T. This scale includes 10 questions with 1 to 5 Likert-scale answers, from strongly disagree to strongly agree. The SUS score varies from 0 to 100 and is considered "excellent" if equal to or higher than 86, good if ≥ 73, and acceptable if ≥ 52.
Demographic and clinical characteristics were compared using Welch's t test for quantitative measures and Fisher's exact test for categorical measures, regarding Controls and NDG groups. To compare cognitive tests from SCD-T, those from CNB, as well as acceptability measures between both groups, we performed generalized linear models with age, gender, education level (with three levels; level 1: ≤ 12 years of education, under the high school diploma; level 2: 13 to 17 years of education, between high school diploma and Master’s degree; level 3: > 17 years of education, higher than Master’s degree), and clinical group as independent variables and each measure as the dependent variable. To correct for multiple testing, the Benjamini–Hochberg procedure was applied. Besides, comparisons between Controls and post-COVID-19 group were performed. To account for the age discrepancy between these two groups (mean ± SD, Controls: 69.9 ± 4.9 vs post-COVID-19: 45.1 ± 11.4, p < 0.001), we transformed the raw scores into standardized scores from the reference CNB using validated norms, controlling either by age, education level, or both. Since norms for tests with SCD-T execution do not exist, we used those from the reference CNB execution. Due to missing norms, especially for the middle-aged population, several standardized test scores could not be computed (DSST bad and total answers). To compare the standardized scores of the groups, we used the Mann–Whitney U test and corrected for multiple testing using the Benjamini–Hochberg procedure. SCD-T scores were compared to their equivalent in the reference CNB using Pearson's correlation coefficients in a pooled Controls and NDG group and in post-COVID-19 subjects. Correlations were performed between (i) NCT good answers and MMSE; (ii) TMT B-A time and FAB; (iii) total 5-WT score and FCSRT total recall. The Benjamini–Hochberg method was used to correct the p-values for statistical test multiplicity.
The same approach was used to compare NDG and post-COVID-19 groups on 5-WT.
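As an illustration of how a SUS score between 0 and 100 is obtained from the 10 answers, the sketch below uses the standard SUS scoring rule (odd-numbered items contribute the answer minus 1, even-numbered items contribute 5 minus the answer, and the sum is multiplied by 2.5). The text does not spell out the scoring, so applying the standard rule here is an assumption, and the function names and the label for scores under 52 are hypothetical. The category cut-offs are those quoted above.

```python
def sus_score(responses):
    """Compute a 0-100 SUS score from ten Likert answers (1 to 5), standard scoring."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("SUS needs exactly ten answers, each between 1 and 5")
    total = 0
    for item_number, answer in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += answer - 1   # odd items are positively worded
        else:
            total += 5 - answer   # even items are negatively worded
    return total * 2.5


def sus_category(score):
    """Map a SUS score to the categories quoted in the text."""
    if score >= 86:
        return "excellent"
    if score >= 73:
        return "good"
    if score >= 52:
        return "acceptable"
    return "below acceptable"  # label assumed for illustration


print(sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1]), sus_category(92.23), sus_category(79.49))
```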
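For readers unfamiliar with the Benjamini–Hochberg correction mentioned above, here is a minimal pure-Python sketch of the adjusted p-value computation. It is an illustration only: the analyses were run in R, and the function name and example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical raw p-values from four comparisons.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A comparison is then declared significant when its adjusted p-value falls below the chosen false discovery rate (for example 0.05).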
The performances of SCD-T to discriminate NDG from Controls were studied through the development of two algorithms: one guided by clinicians and another one without a priori (i.e., classical machine learning classifier). The clinician-guided algorithm is intended to provide a reliable estimate of the clinical signature of AD. We first used the previously established 5-WT clinical threshold of 9 to identify an amnestic syndrome of the hippocampal type. Second, we aimed to identify individuals with a dysexecutive syndrome (using NCT and TMT scores) among individuals with a score of 10 at the 5-WT. The machine learning algorithm was tested using a multiple logistic regression model, including the 8 scores of the 5-WT, TMT, and NCT cognitive tests and accounting for age, gender, education level, and medical comorbidities associated with AD (hypertension, diabetes, cardiovascular diseases, and depression). Contrary to the clinician-guided algorithm that relies on the domain-expert’s knowledge, the machine learning classifier used different kinds of variables (continuous cognitive tests, continuous, dichotomous, and ordinal socio-demographic variables, and dichotomous comorbidities) to extract knowledge and train the algorithm. We used fivefold cross-validation to optimize the threshold and test the algorithms’ performance. For the clinician-guided algorithm, threshold optimization was performed on NCT or TMT scores for subjects with a 5-WT equal to 10 on the training set. The machine learning algorithm threshold optimization was performed on the estimated probabilities extracted from the multiple logistic regression model on the training set. For both algorithms, the threshold optimization was performed to maximize specificity for a sensitivity of at least 95%. We set this level of sensitivity to avoid false negatives, i.e. falsely reassuring someone who should be consulting, while maximizing the specificity to maintain a low number of false positives.
During training, the threshold is optimized in individuals from the training set (104 individuals), and then tested on subjects previously unseen by the model (test set: 25 individuals). This approach mimics the clinical situation, where the CNB is used to diagnose a cognitive impairment in a new patient. Performance indicators such as sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were assessed through the means and standard deviations of the fivefold cross-validation.
Statistical analyses were performed using R 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria; https://www.R-project.org/).
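The threshold optimization rule (maximize specificity subject to a sensitivity of at least 95%) can be sketched as follows. This is not the authors' code: the function name, the convention that a subject is classified as positive when the estimated probability is at or above the threshold, and the toy data are all assumptions made for illustration.

```python
def choose_threshold(probabilities, labels, min_sensitivity=0.95):
    """Pick the cut-off with the highest specificity among those reaching min_sensitivity.

    probabilities: estimated probability of being a patient (training set)
    labels: 1 for patient, 0 for control
    Returns (threshold, sensitivity, specificity).
    """
    n_patients = sum(labels)
    n_controls = len(labels) - n_patients
    if n_patients == 0 or n_controls == 0:
        raise ValueError("both patients and controls are needed")
    best = None
    # Every observed probability is a candidate cut-off.
    for threshold in sorted(set(probabilities)):
        true_pos = sum(1 for p, y in zip(probabilities, labels) if y == 1 and p >= threshold)
        true_neg = sum(1 for p, y in zip(probabilities, labels) if y == 0 and p < threshold)
        sensitivity = true_pos / n_patients
        specificity = true_neg / n_controls
        if sensitivity >= min_sensitivity and (best is None or specificity > best[2]):
            best = (threshold, sensitivity, specificity)
    return best


# Toy training set (hypothetical values): four patients and four controls.
probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
print(choose_threshold(probs, labels))
```

The lowest candidate cut-off classifies everyone as positive, so at least one candidate always satisfies the sensitivity constraint; the search then trades false positives away as far as the constraint allows.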
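The performance indicators and their summary over folds can be sketched as below. The fold counts are hypothetical and only illustrate the arithmetic; whether the reported standard deviations are sample or population standard deviations is not stated, so the use of the sample standard deviation here is an assumption.

```python
from statistics import mean, stdev


def confusion_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from the counts of one test fold."""
    return {
        "sensitivity": tp / (tp + fn),  # patients correctly flagged
        "specificity": tn / (tn + fp),  # controls correctly cleared
        "ppv": tp / (tp + fp),          # flagged subjects who are patients
        "npv": tn / (tn + fn),          # cleared subjects who are controls
    }


def summarize_folds(fold_counts):
    """Return {metric: (mean, sample SD)} over a list of (tp, fp, tn, fn) tuples."""
    per_fold = [confusion_metrics(*counts) for counts in fold_counts]
    summary = {}
    for name in per_fold[0]:
        values = [metrics[name] for metrics in per_fold]
        summary[name] = (mean(values), stdev(values))
    return summary


# Hypothetical (tp, fp, tn, fn) counts for two test folds.
for name, (avg, sd) in summarize_folds([(8, 2, 8, 2), (10, 0, 10, 0)]).items():
    print(f"{name}: {100 * avg:.1f} ± {100 * sd:.1f}%")
```

Note that PPV and NPV depend on the proportion of patients in the sample, so values estimated in a memory-clinic cohort do not transfer directly to a primary care population with a lower prevalence.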
Table 1 shows the main characteristics of the study participants. The mean age of the participants was 71.3 years, with patients in the NDG group being significantly older compared to controls (mean ± SD: 72.6 ± 6.8 vs. 69.9 ± 4.9, respectively, p = 0.011). More men were included in the NDG group (54.7% vs. 32.3%, p = 0.013). There was no significant difference in education level between the groups of participants.
The NDG group had a higher cognitive complaint score (Mac Nair 15 items; mean difference estimate (MDE) ± standard error (SE): -6.8 ± 1.3, p < 0.001) and self-rated depression score (GDS 15 items; MDE ± SE: -1.5 ± 0.5, p = 0.003).
Comparison of the performance of NDG and Control groups in SCD-T and in the reference CNB (Table 1)
NDG patients had significantly worse cognitive test scores in the SCD-T and the reference CNB, except for the NCT wrong answers.
Association between SCD-T and reference CNB scores (Fig. 1)
All the SCD-T cognitive tests scores were significantly associated (p < 0.001) with their CNB equivalent when pooling the Controls and NDG individuals. The highest correlation coefficients observed were between the 5-WT and FCSRT total recall (r = 0.84, Fig. 1.C) and between the NCT good answers and the MMSE (r = 0.72, Fig. 1.A). The SCD-T TMT B-A correlated with the FAB (r = -0.60, Fig. 1.B).
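The sign of these coefficients follows from the direction of each score. A minimal sketch of Pearson's r, with hypothetical data, shows why a time-based score such as TMT B-A (higher is worse) correlates negatively with a score such as the FAB (higher is better):

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical values: longer TMT B-A times paired with lower FAB scores.
tmt_b_minus_a_seconds = [30, 45, 60, 90, 120]
fab_scores = [18, 17, 16, 13, 11]
print(round(pearson_r(tmt_b_minus_a_seconds, fab_scores), 2))
```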
Performances of the two algorithms are presented in Table 2. The first step of the two-stage ‘sieve approach’ of the clinician-guided algorithm (i.e., the identification of the ASHT using the 5-WT threshold of 9) demonstrated a 92.3% sensitivity and an 80.7% specificity. The second step identified the NCT good answers threshold of [mean ± standard deviation]: 15.0 ± 2.4 as the best discriminant between patients with a 5-WT score of 10 and Controls. As a whole, this clinician-guided algorithm had a 95.4 ± 3.8% sensitivity and an 80.5 ± 8.7% specificity.
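The two-step rule described above can be written as a short decision function. This is a sketch under stated assumptions, not the deployed SCD-T algorithm: it assumes that a 5-WT total at or below 9 counts as positive, that subjects scoring 10 are flagged when their number of NCT good answers is strictly below the cut-off, and it uses 15 (the mean threshold across folds) as an illustrative default. The function and parameter names are hypothetical.

```python
def clinician_guided_flag(five_wt_total, nct_good_answers, five_wt_cutoff=9, nct_cutoff=15):
    """Return True when the two-step rule flags a possible cognitive impairment."""
    if not 0 <= five_wt_total <= 10:
        raise ValueError("the 5-WT raw total score ranges from 0 to 10")
    # Step 1: a 5-WT total at or below the cut-off suggests an amnestic
    # syndrome of the hippocampal type (assumed direction of the threshold).
    if five_wt_total <= five_wt_cutoff:
        return True
    # Step 2: only subjects with a perfect 5-WT score reach this point; a low
    # number of NCT good answers is taken as a sign of a dysexecutive syndrome.
    return nct_good_answers < nct_cutoff


print(clinician_guided_flag(8, 30))   # flagged at step 1
print(clinician_guided_flag(10, 12))  # flagged at step 2
print(clinician_guided_flag(10, 20))  # not flagged
```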
Machine learning classifier
The machine learning classifier considered 8 scores from the 3 SCD-T cognitive tests, the socio-demographic variables, and the medical comorbidities. This classifier obtained a [mean ± standard deviation] 96.8 ± 3.9% sensitivity and a 90.7 ± 5.8% specificity. The positive predictive value (PPV) was 91.6 ± 4.9% and the negative predictive value (NPV), 97.1 ± 3.6%.
SCD-T acceptability (Table 1)
The System Usability Scale (SUS) was considered excellent in the Control group (mean ± SD: 92.23 ± 10.75) and as good in the NDG group (79.49 ± 18.70) (MDE ± SE: 11.4 ± 2.8, p < 0.001). Only 4 patients and 1 control found the instructions unclear. Compared to patients from the NDG group, Control subjects were more likely to think that their performance was normal for age (61.5% vs 17.2%, Odds Ratio (OR) ± SE: 8.3 ± 3.8, p < 0.001), to have at least one good feeling during the tests (75.4% vs 48.4%, OR ± SE: 5.2 ± 2.3, p < 0.001), to think that an instructor’s explanation would not have been very useful (92.3% vs 71.9%, OR ± SE: 4.6 ± 2.6, p = 0.005). Seventy-nine percent of NDG patients and control subjects thought their results were in line with their memory and attention performance; this percentage did not significantly differ between Controls and NDG groups (84.6% vs 73.4%, OR ± SE: 1.5 ± 0.7, p = 0.395).
SCD-T in the post-COVID-19 group
Amongst the 20 post-COVID-19 individuals, 14 had at least 1 test from the reference CNB below a pathological cut-off.
After standardization, all SCD-T and reference CNB cognitive tests scores of the post-COVID-19 group were significantly worse than the Control group (Table 3).
In the post-COVID-19 group, all the SCD-T tests were significantly correlated (p < 0.05) with their equivalent in the CNB. The highest correlation coefficients were observed for the correlation between the 5-WT score and the FCSRT total recall (r = 0.67, Fig. 2.C) and the correlation between the TMT B-A and the FAB (r = -0.65, Fig. 2.B). The NCT good answers were moderately correlated with the MMSE (r = -0.46, Fig. 2.A).
After standardization, the 5-WT total scores and the 5-WT weighted total scores of the NDG participants were significantly worse than those of the 14 post-COVID-19 subjects (Table 4).
This study demonstrated that all the SCD-T cognitive tests were significantly associated with their equivalent in the clinical setting. Using a straightforward SCD-T setting (the 5-WT with a previously established threshold of 9, followed by an NCT categorization for the individuals performing with a score of 10 on 5-WT) we obtained a 95.4% sensitivity and an 80.5% specificity to discriminate NDG patients from Controls. The diagnostic performance reached a 96.8% sensitivity and a 90.7% specificity using a machine learning classifier based on 8 scores from the 3 SCD-T cognitive tests, socio-demographic variables (age, gender, education level), and medical comorbidities associated with AD (hypertension, diabetes, cardiovascular diseases, and depression). The acceptability of SCD-T was good to excellent according to the severity of the cognitive deficit.
SCD-T was developed to provide a reliable tool for the timely identification of mild cognitive impairment in primary care. One issue is providing a reliable and easy-to-implement automated tool in the healthcare system that can play a screening role in the general population to optimize the healthcare circuit related to cognitive disorders, as the number of people with cognitive deficits is growing. Still, this screening role must be able to identify mild or subtle cognitive impairments. The three cognitive domains assessed by the SCD-T are coherent with these objectives. The 5-WT is an episodic memory test based on the cueing of the words to be remembered. Therefore, this simple test can isolate the storage deficit characteristic of an amnestic syndrome of the hippocampal type from any other memory disorder, i.e., with a particular specificity for AD. In addition, the DSST relies on many executive functions such as central processing, attention, and information processing speed, which reflect overall intellectual efficiency and are related to the severity of cognitive impairment. Finally, the TMT is sensitive to early and subtle cognitive changes, which may occur in AD before the amnestic syndrome. The digital versions of each test were significantly consistent with their reference version, with the highest correlation coefficient (r = 0.84) for the 5-WT compared to the FCSRT total recall.
Both diagnostic classifier approaches tested (the clinician-guided one and the machine learning one) provided a good discriminatory capacity for patients of the NDG group, with sensitivity higher than 95% and specificity higher than 80%. However, the machine learning algorithm improved the specificity by about 10 percentage points (80.5 ± 8.7% to 90.7 ± 5.8%).
As our validation study took place during the COVID-19 pandemic, we included subjects with a cognitive complaint following a COVID-19 infection. This group allowed us to test SCD-T in a non-degenerative condition. 70% of the post-COVID-19 individuals tested had at least one deficit in one cognitive domain using the CNB, as reported elsewhere [57, 58]. We replicated a good correlation between the SCD-T and the CNB cognitive tests. Moreover, the performance of patients with NDG diseases was significantly worse than that of post-COVID-19 subjects with cognitive deficit on the reference CNB, demonstrating that SCD-T is able to discriminate different patterns of memory disorders (i.e., amnestic syndrome of the hippocampal type from a memory deficit due to attentional/executive disorder), thanks to the 5-WT.
In the US, screening for cognitive impairment has been encouraged at the Medicare Annual Wellness Visit. To be approved by regulatory agencies for clinical use and covered by health insurance, a cognitive screening tool needs to be robustly validated and impact the clinical diagnosis and therapeutic decisions. Our validation study’s results will help to implement SCD-T as a screening tool in the general population. In France, we plan to propose the following implementation in line with primary care physicians: SCD-T will be available through personal access after a prescription by a clinician. The results and conclusion of SCD-T will be detailed and interpreted based on the algorithms immediately at the end of the test, and available for the prescriber (mainly general practitioners) on a secured web platform with easy access. In case of abnormal results, the general practitioner can refer the subject to the local memory consultation for a more extensive assessment. In case of normal results, the general practitioners will reassure the subject and may also suggest a follow-up evaluation using SCD-T.
Besides the tool’s diagnostic performance, it is also essential that the test is acceptable for people unfamiliar with digital tools. SCD-T was designed so that any individual could complete the test alone (without direct supervision by a healthcare provider). SCD-T had a good to excellent acceptability performance, as illustrated by the high SUS scores in NDG and Control groups. As expected, the test duration was longer in subjects with more severe cognitive impairment.
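As a reminder of how a SUS score is obtained, the sketch below applies the standard scoring rule of the System Usability Scale: ten items rated from 1 to 5, where odd-numbered items contribute (rating − 1), even-numbered items contribute (5 − rating), and the sum is multiplied by 2.5 to give a score from 0 to 100. The function name and the example responses are hypothetical; this is an illustration of the published scale, not the scoring code of SCD-T.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Return the System Usability Scale score (0-100) for one respondent.

    `responses` holds the ten item ratings in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("SUS requires exactly 10 item responses")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError(f"item {item_number}: rating must be between 1 and 5")
        if item_number % 2 == 1:
            # Odd items are positively worded: higher rating is better.
            total += rating - 1
        else:
            # Even items are negatively worded: lower rating is better.
            total += 5 - rating

    # Each item contributes 0-4, so the raw sum is 0-40; scale to 0-100.
    return total * 2.5


if __name__ == "__main__":
    # Hypothetical respondent, for illustration only.
    print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))
```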
Numerous digital applications designed to assess cognitive functioning exist; however, the diagnostic performance of most of them has yet to be academically evaluated [11, 59]. Compared to the sensitivities and specificities of 46 digital cognitive tests with self-administered assessment to detect cognitive impairment in elderly participants (mild cognitive impairment and dementia versus controls), the performance of SCD-T was very high, among the best. One of the main limitations of the studies reported in that review is the limited number of subjects in the control groups, with fewer than 30 participants in 15 studies.
The strengths of our study were the large group size and the detailed diagnostic workup in each group of participants. It is noteworthy that the diagnoses were validated in interdisciplinary meetings based on the neuropsychological, imaging, and CSF data. Hence, SCD-T can detect a mild cognitive impairment but, above all, an amnestic syndrome of the hippocampal type, a core phenotype of typical Alzheimer’s disease. These results are an important added value of our digital application, especially since, to the best of our knowledge, SCD-T has so far the highest diagnostic performance compared to other applications. Finally, SCD-T is the only digital application that considers risk factors of cognitive impairment and dementia (such as hypertension, diabetes, cardiovascular diseases, and depression). By including clinical data and these modifiable risk factors, the machine learning algorithm allowed us to adjust the results provided by the cognitive tests. Moreover, these factors will be mentioned in the Curapy report to the general practitioner so that they can be acted upon to prevent cognitive decline. However, our study had several limitations. The sample size was too small to stratify by age, education, and gender. Therefore, we adjusted for these three effects in the comparison between NDG and Controls, and age, education, and gender were included as features in the machine learning algorithm. For this study, the Controls were selected on the absence of any cognitive impairment on the CNB. They were all cognitively normal. Some of them (34 out of 65) had a memory complaint, with a negative CSF AD biomarker investigation in 32 subjects, and 2 were asymptomatic at risk.
Our population was highly selected (age between 60 and 85 years, native French speakers, ≥ 7 years of schooling, MMSE ≥ 20, population referred to a tertiary care center with AD biomarkers in the CSF…), which is not representative of the subjects consulting their general practitioner for a cognitive complaint, and there was a low number of participants with a non-degenerative cognitive impairment, which is not representative of the subjects assessed in memory consultations. Besides, SCD-T mainly focuses on amnestic and dysexecutive cognitive functions, and our NDG group was mainly composed of typical AD phenotypes. SCD-T will likely be less accurate in screening non-amnestic non-dysexecutive neurodegenerative diseases, such as prodromal Primary Progressive Aphasia or Posterior Cortical Atrophy.
Hence, SCD-T is a simple and fast application with strong diagnostic performance and validated with diagnosis categorization from an expert memory center. This application allows great confidence in identifying an amnestic syndrome of the hippocampal type and, therefore, in detecting AD at a prodromal stage of the disease.
Healthcare systems need structural and functional innovation toward early detection and diagnosis of cognitive disorders, especially AD. We demonstrated that SCD-T has a high diagnostic performance in identifying prodromal neurodegenerative diseases, especially AD. This opens the opportunity to implement this tool in the general population and to test its ability to guide general practitioners in their referrals to memory clinics and avoid unnecessary referrals of cognitively normal elderly individuals, thus saving costs and improving care pathway efficiency. It may also be helpful in pre-screening individuals to be included in AD clinical trials.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Santé-Cerveau digital tool
5 Word Test
Trail Making Test
Number coding test
Mini Mental State Examination
Geriatric Depression Scale
Digit Symbol Substitution Test
Wechsler Adult Intelligence Scale
Amnestic syndrome of the hippocampal type
Comprehensive neuropsychological battery, and functional battery
Institut de la Mémoire et de la Maladie d’Alzheimer
Free and Cued Selective Reminding Test
Frontal Assessment Battery
Hospital Anxiety and Depression
Instrumental Activities of Daily Living
Paced Auditory Serial Addition Test
18F-FDG PET-MRI: 18F-fluoro-deoxy-glucose positron emission tomography-magnetic resonance imaging
Severe acute respiratory syndrome coronavirus 2
Reverse transcription polymerase chain reaction
Questionnaire on Cognitive Tests
System Usability Scale
Positive predictive value
Negative predictive value
Mean difference estimate
Epelbaum S, Paquet C, Hugon J, Dumurgier J, Wallon D, Hannequin D, et al. How many patients are eligible for disease-modifying treatment in Alzheimer’s disease? A French national observational study over 5 years. BMJ Open. 2019;9(6):e029663.
Cordell CB, Borson S, Boustani M, Chodosh J, Reuben D, Verghese J, et al. Alzheimer’s Association recommendations for operationalizing the detection of cognitive impairment during the Medicare Annual Wellness Visit in a primary care setting. Alzheimers Dement. 2013;9(2):141–50.
Barnett JH, Lewis L, Blackwell AD, Taylor M. Early intervention in Alzheimer’s disease: a health economic study of the effects of diagnostic timing. BMC Neurol. 2014;14:101.
Yu JT, Xu W, Tan CC, Andrieu S, Suckling J, Evangelou E, et al. Evidence-based prevention of Alzheimer’s disease: systematic review and meta-analysis of 243 observational prospective studies and 153 randomised controlled trials. J Neurol Neurosurg Psychiatry. 2020;91(11):1201–9.
Villain N, Planche V, Levy R. High-clearance anti-amyloid immunotherapies in Alzheimer’s disease. Part 1: Meta-analysis and review of efficacy and safety data, and medico-economical aspects. Rev Neurol. 2022;178(10):1011–30.
Villain N, Planche V, Levy R. High-clearance anti-amyloid immunotherapies in Alzheimer’s disease. Part 2: putative scenarios and timeline in case of approval, recommendations for use, implementation, and ethical considerations in France. Rev Neurol. 2022;178(10):999–1010.
Bradford A, Kunik ME, Schulz P, Williams SP, Singh H. Missed and delayed diagnosis of dementia in primary care: prevalence and contributing factors. Alzheimer Dis Assoc Disord. 2009;23(4):306–14.
Bernstein A, Rogers KM, Possin KL, Steele NZR, Ritchie CS, Kramer JH, et al. Dementia assessment and management in primary care settings: a survey of current provider practices in the United States. BMC Health Serv Res. 2019;19(1):919.
Mitchell AJ, Beaumont H, Ferguson D, Yadegarfar M, Stubbs B. Risk of dementia and mild cognitive impairment in older people with subjective memory complaints: meta-analysis. Acta Psychiatr Scand. 2014;130(6):439–51.
Lang L, Clifford A, Wei L, Zhang D, Leung D, Augustine G, et al. Prevalence and determinants of undetected dementia in the community: a systematic literature review and a meta-analysis. BMJ Open. 2017;7(2):e011146.
Chan JYC, Bat BKK, Wong A, Chan TK, Huo Z, Yip BHK, et al. Evaluation of Digital Drawing Tests and Paper-and-Pencil Drawing Tests for the Screening of Mild Cognitive Impairment and Dementia: A Systematic Review and Meta-analysis of Diagnostic Studies. Neuropsychol Rev. 2022;32(3):566–76.
Sabbagh MN, Boada M, Borson S, Doraiswamy PM, Dubois B, Ingram J, et al. Early Detection of Mild Cognitive Impairment (MCI) in an At-Home Setting. J Prev Alzheimers Dis. 2020;7(3):171–8.
Díez-Cirarda M, Yus M, Gómez-Ruiz N, Polidura C, Gil-Martínez L, Delgado-Alonso C, et al. Multimodal neuroimaging in post-COVID syndrome and correlation with cognition. Brain. 2022. https://doi.org/10.1093/brain/awac384.
Dubois B, Epelbaum S, Nyasse F, Bakardjian H, Gagliardi G, Uspenskaya O, et al. Cognitive and neuroimaging features and brain β-amyloidosis in individuals at risk of Alzheimer’s disease (INSIGHT-preAD): a longitudinal observational study. Lancet Neurol. 2018;17(4):335–46.
Dubois B, Villain N, Frisoni GB, Rabinovici GD, Sabbagh M, Cappa S, et al. Clinical diagnosis of Alzheimer’s disease: recommendations of the International Working Group. Lancet Neurol. 2021;20(6):484–96.
McKeith IG, Boeve BF, Dickson DW, Halliday G, Taylor JP, Weintraub D, et al. Diagnosis and management of dementia with Lewy bodies: Fourth consensus report of the DLB Consortium. Neurology. 2017;89(1):88–100.
Gorno-Tempini ML, Hillis AE, Weintraub S, Kertesz A, Mendez M, Cappa SF, et al. Classification of primary progressive aphasia and its variants. Neurology. 2011;76(11):1006–14.
Rascovsky K, Hodges JR, Knopman D, Mendez MF, Kramer JH, Neuhaus J, et al. Sensitivity of revised diagnostic criteria for the behavioural variant of frontotemporal dementia. Brain. 2011;134(9):2456–77.
Armstrong MJ, Litvan I, Lang AE, Bak TH, Bhatia KP, Borroni B, et al. Criteria for the diagnosis of corticobasal degeneration. Neurology. 2013;80(5):496–503.
Soriano JB, Murthy S, Marshall JC, Relan P, Diaz JV. WHO Clinical Case Definition Working Group on Post-COVID-19 Condition. A clinical case definition of post-COVID-19 condition by a Delphi consensus. Lancet Infect Dis. 2022;22(4):e102-7.
McNair D, Kahn R. Self-assessment of cognitive deficits. Assessment in geriatric psychopharmacology. New Canaan: Mark Powley Associates; 1983.
Sheikh JI, Yesavage JA. Geriatric Depression Scale (GDS). Recent evidence and development of a shorter version. In: Brink, TL., editor. Clinical Gerontology: A Guide to Assessment and Intervention. New York: The Haworth Press; 1986. p. 165–73.
Livingston G, Huntley J, Sommerlad A, Ames D, Ballard C, Banerjee S, et al. Dementia prevention, intervention, and care: 2020 report of the Lancet Commission. Lancet. 2020;396(10248):413–46.
Wechsler D. The Measurement of Adult Intelligence. Baltimore: The Williams & Wilkins Company; 1939.
Lezak. Executive functions and motor performance. In: Neuropsychological assessment. 3rd. New York: Oxford University Press; 1995. p. 650–85.
Donohue MC, Sperling RA, Salmon DP, Rentz DM, Raman R, Thomas RG, et al. The preclinical Alzheimer cognitive composite: measuring amyloid-related decline. JAMA Neurol. 2014;71(8):961–70.
Reitan R. Trail making Test: Manual for administration and scoring. Tucson: Reitan Neuropsychology Laboratory; 1992.
Jutten RJ, Sikkes SAM, Amariglio RE, Buckley RF, Properzi MJ, Marshall GA, et al. Identifying Sensitive Measures of Cognitive Decline at Different Clinical Stages of Alzheimer’s Disease. J Int Neuropsychol Soc. 2021;27(5):426–38.
Pillon B, Blin J, Vidailhet M, Deweer B, Sirigu A, Dubois B, et al. The neuropsychological pattern of corticobasal degeneration: comparison with progressive supranuclear palsy and Alzheimer’s disease. Neurology. 1995;45(8):1477–83.
Dubois B, Touchon J, Portet F, Ousset PJ, Vellas B, Michel B. “The 5 words”: a simple and sensitive test for the diagnosis of Alzheimer’s disease. Presse Med. 2002;31(36):1696–9.
Dubois B, Albert ML. Amnestic MCI or prodromal Alzheimer’s disease? Lancet Neurol. 2004;3(4):246–8.
Wagner M, Wolf S, Reischies FM, Daerr M, Wolfsgruber S, Jessen F, et al. Biomarker validation of a cued recall memory deficit in prodromal Alzheimer disease. Neurology. 2012;78(6):379–86.
Folstein MF, Folstein SE, McHugh PR. « Mini-mental state ». A practical method for grading the cognitive state of patients for the clinician. J Psychiatr Res. 1975;12(3):189–98.
Wechsler WD, Scale M. In: New-York. NY: Psychological Corporation; 1975.
Merck C, Charnallet A, Auriacombe S, Belliard S, Hahn-Barma V, Kremin H, et al. La batterie d’évaluation des connaissances sémantiques du GRECO (BECS-GRECO): validation et données normatives. Rev Neuropsychol. 2011;3(4):235–55.
Van der Linden M, Coyette F, Poitrenaud J, Kalafat M, Calacis F, Wyns C, et al. L’épreuve de rappel libre/rappel indicé à 16 items (RL/RI-16). In: L’évaluation des troubles de la mémoire : Présentation de quatre tests de mémoire épisodique (avec leur étalonnage). In M. Van der Linden, S. Adam, A. Agniel, C. Baisset Mouly, et al. (Eds.). Marseille : Solal; 2004.
Mahieux-Laurent F, Fabre C, Galbrun E, Dubrulle A, Moroni C, groupe de réflexion sur les praxies du CMRR Ile-de-France Sud. Validation of a brief screening scale evaluating praxic abilities for use in memory clinics Evaluation in 419 controls, 127 mild cognitive impairment and 320 demented patients. Rev Neurol. 2009;165(6–7):560–7.
Dubois B, Slachevsky A, Litvan I, Pillon B. The FAB: a Frontal Assessment Battery at bedside. Neurology. 2000;55(11):1621–6.
Tombaugh TN. Trail Making Test A and B: normative data stratified by age and education. Arch Clin Neuropsychol. 2004;19(2):203–14.
Osterrieth PA. Le test de copie d’une figure complexe; contribution à l’étude de la perception et de la mémoire. Test of copying a complex figure; contribution to the study of perception and memory. Archives de Psychologie. 1944;30:206–356.
Meulemans T. La batterie Grefex. In: Godefroy O et le Grefex, Fonctions exécutives et pathologies neurologiques et psychiatriques. Marseille: Solal; 2008. p. 217–29.
Zigmond AS, Snaith RP. The hospital anxiety and depression scale. Acta Psychiatr Scand. 1983;67(6):361–70.
Starkstein SE, Migliorelli R, Manes F, Tesón A, Petracca G, Chemerinski E, et al. The prevalence and clinical correlates of apathy and irritability in Alzheimer’s disease. Eur J Neurol. 1995;2(6):540–6.
Lawton MP, Brody EM. Assessment of older people: self-maintaining and instrumental activities of daily living. Gerontologist. 1969;9(3):179–86.
Sikkes SAM, de Lange-de Klerk ESM, Pijnenburg YAL, Gillissen F, Romkes R, Knol DL, et al. A new informant-based questionnaire for instrumental activities of daily living in dementia. Alzheimers Dement. 2012;8(6):536–43.
Wechsler D. Wechsler Adult Intelligence Scale-Fourth Edition (WAIS-IV) administration and scoring manual. In: San Antonio: The Psychological Corporation; 2008.
Naegele B, Mazza S. Test d’attention soutenue : PASAT modifié. Marseille: Solal; 2004.
Albaret JM, Migliore L. Test de Stroop. In: Éditions du Centre de psychologie appliquée (ECPA); 1999.
Kas A, Soret M, Pyatigoskaya N, Habert MO, Hesters A, Le Guennec L, et al. The cerebral network of COVID-19-related encephalopathy: a longitudinal voxel-based 18F-FDG-PET study. Eur J Nucl Med Mol Imaging. 2021;48(8):2543–57.
Barbeau E, Didic M, Tramoni E, Felician O, Joubert S, Sontheimer A, et al. Evaluation of visual recognition memory in MCI patients. Neurology. 2004;62(8):1317–22.
Ekman P, Rosenberg E. What the Face Reveals: Basic and Applied Studies of Spontaneous Expression Using the Facial Action Coding System (FACS). In Oxford: Oxford University Press; 1997.
Cottraux J. French version of the PTSD Checklist Scale. In: Les thérapies comportementales et cognitives. Paris: Masson; 1996.
Chalder T, Berelowitz G, Pawlikowska T, Watts L, Wessely S, Wright D, et al. Development of a fatigue scale. J Psychosom Res. 1993;37(2):147–53.
M’Barek L, Radakovic R, Noquet M, Laurent A, Allain P. Different aspects of emotional processes in apathy: Application of the French translated dimensional apathy scale. Curr Psychol. 2018;39:564–70.
Brooke J. SUS: A « quick and dirty » usability scale. In: Usability evaluation in Industry. Taylor & Francis, London; 1996. p. 189‑94.
Patterson C. World Alzheimer's Report 2018. London: Alzheimer's Disease International; 2018.
Poletti S, Palladini M, Mazza MG, De Lorenzo R, Furlan R, COVID-19 BioB Outpatient Clinic Study group, et al. Long-term consequences of COVID-19 on cognitive functioning up to 6 months after discharge: role of depression and impact on quality of life. Eur Arch Psychiatry Clin Neurosci. 2022;272(5):773–82.
Chaumont H, Meppiel E, Roze E, Tressières B, de Broucker T, Lannuzel A, et al. Long-term outcomes after NeuroCOVID: A 6-month follow-up study on 60 patients. Rev Neurol. 2022;178(1–2):137–43.
Charalambous AP, Pye A, Yeung WK, Leroi I, Neil M, Thodi C, et al. Tools for App- and Web-Based Self-Testing of Cognitive Impairment: Systematic Search and Evaluation. J Med Internet Res. 2020;22(1):e14551.
Picard C, Pasquier F, Martinaud O, Hannequin D, Godefroy O. Early onset dementia: characteristics in a large cohort from academic memory clinics. Alzheimer Dis Assoc Disord. 2011;25(3):203–5.
The clinical study was supported by a grant from the Fondation Recherche Alzheimer (FRA).
Ethics approval and consent to participate
All participants gave written informed consent. The study was approved by the Ethical Committee (Comité de Protection des Personnes (CPP) du Sud-Ouest et Outre-Mer 4) (AM2020-277/02/CPP2019-10–077).
Consent for publication
SB received honoraria from Biogen for a symposium and from Eisai for a lecture, and is an investigator without personal fees in therapeutic trials from Biogen, Roche, Eisai, Eli Lilly, Janssen, Johnson & Johnson, Alector, UCB, NovoNordisk. SE received lecture honoraria from Roche and consultant honoraria for participating in scientific advisory boards of GE Healthcare, Astellas Pharma, Roche, Biogen and Eli Lilly. NV reports grants from Fondation Recherche Alzheimer, Département Médical Universitaire APHP–Sorbonne Université, and non-financial support from GE Healthcare, Merz Pharma, UCB Pharma, Medtronic, and Laboratoire Aguettant, outside the submitted work. BD reports personal fees from Biogen, ABScience, and grants paid to his institution from Roche, Merck-Avenir Foundation, and Fondation Recherche sur Alzheimer, outside the submitted work.
Pierre Foulon is Director and Heïdy Jean-Marie is Technical Director of Mindmaze France, a private company that develops DTx solutions.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Lesoil, C., Bombois, S., Guinebretiere, O. et al. Validation study of “Santé-Cerveau”, a digital tool for early cognitive changes identification. Alz Res Therapy 15, 70 (2023). https://doi.org/10.1186/s13195-023-01204-x