The Toronto Cognitive Assessment (TorCA): normative data and validation to detect amnestic mild cognitive impairment
Alzheimer's Research & Therapy volume 10, Article number: 65 (2018)
The Correction to this article has been published in Alzheimer's Research & Therapy 2018 10:120
A need exists for easily administered assessment tools to detect mild cognitive changes that are more comprehensive than screening tests but shorter than a neuropsychological battery and that can be administered by physicians, as well as any health care professional or trained assistant in any medical setting. The Toronto Cognitive Assessment (TorCA) was developed to achieve these goals.
We obtained normative data on the TorCA (n = 303), determined test reliability, developed an iPad version, and validated the TorCA against neuropsychological assessment for detecting amnestic mild cognitive impairment (aMCI) (n = 50/57, aMCI/normal cognition). For the normative study, healthy volunteers were recruited from the Rotman Research Institute registry. For the validation study, the sample comprised participants with aMCI or normal cognition based on neuropsychological assessment. Cognitively normal participants were recruited from both healthy volunteers in the normative study sample and the community.
The TorCA provides a stable assessment of multiple cognitive domains. The total score correctly classified 79% of participants (sensitivity 80%; specificity 79%). In an exploratory logistic regression analysis, indices of Immediate Verbal Recall, Delayed Verbal and Visual Recall, Visuospatial Function, and Working Memory/Attention/Executive Control, a subset of the domains assessed by the TorCA, correctly classified 92% of participants (sensitivity 92%; specificity 91%). Paper and iPad version scores were equivalent.
The TorCA can improve resource utilization by identifying patients with aMCI who may not require more resource-intensive neuropsychological assessment. Future studies will focus on cross-validating the TorCA for aMCI, and validation for disorders other than aMCI.
Brief tests such as the Mini-Mental State Examination (MMSE) and the Montreal Cognitive Assessment (MoCA) are popular screens for cognitive function. Neuropsychological assessments facilitate better understanding of cognitive performance for diagnosis but are time consuming, resource intensive, and suited for administration only by neuropsychologists, a resource that is often not readily available. Consequently, given the growing emphasis on early detection of cognitive impairment, there is a need for assessment tools that are intermediate between brief screening tests and neuropsychological batteries, can be administered by physicians as well as any health care professional or trained assistant in any medical setting, and can accurately identify mild cognitive decline. To accomplish this goal, we developed the Toronto Cognitive Assessment (TorCA) by enhancing the psychometric properties of the Behavioural Neurology Assessment, a screening test covering a broad spectrum of cognitive functions for diagnosing mild to moderate dementia, so that it can detect mild cognitive deficits. This was done through the addition of more robust verbal learning and delayed recall, a complex figure copy with delayed recall, semantic knowledge items, a version of Trails A and B, and revision of the subset of language tests.
Our objectives were to obtain normative data on the TorCA and to validate this test for detection of amnestic mild cognitive impairment (aMCI). In addition to the paper version, we developed an electronic application for the iPad and assessed equivalency between the two versions. The advantages of an electronic application include automatic scoring, automatic point-of-care data collection for potential data entry into a clinical or research registry, a printable summary of results, and graphical representation of percentile performance on each cognitive domain.
The TorCA consists of 27 subtests within seven cognitive domains (Orientation, Immediate Recall, Delayed Recall, Delayed Recognition, Visuospatial Function, Working Memory/Attention/Executive Control, and Language; Table 1). It can be administered by any health care professional or trained assistant and is suitable for use in any medical setting. Each domain index score is the sum of the subtest scores within that domain. The Sum Index is the sum of all subtest scores.
Orientation
There are 12 items included: year, month, day, date, season, place/building, floor, city, province, country, Prime Minister, and Premier of the province.
Immediate Verbal Recall
The CERAD 10-Word list is presented over three trials.
Delayed Verbal and Visual Recall
Delayed recall of the CERAD Word List and the Benson Figure Copy is assessed after a delay of at least 10 min.
Delayed Verbal and Visual Recognition
Recognition of whether words appeared in the CERAD list and which one of four complex figures was copied are assessed.
Working Memory/Attention/Executive Control
Working memory and attention are assessed by Digit Span and Serial Subtractions. Executive control is assessed by drawing Alternating Sequences, Verbal Letter Fluency, and Trail Making A and B. A left–right reversed version of Trail Making is used to reduce practice effects on the standard version.
Language
There are eight subtests included: Verbal Fluency (animal names), confrontation naming of 15 items from the Multilingual Naming Test (MINT), Sentence Repetition, Sentence Comprehension, Single Word Reading and Comprehension (auditory and reading), and Semantic Knowledge.
TorCA Sum Index
Consistent with standard practice in neuropsychology, there is no upper limit on Verbal Fluency for “F” words and animals. Therefore, there is no maximum on the Sum Index.
Standardization and normative sample
The study was approved by the Research Ethics Board at Baycrest Health Sciences. Healthy volunteers (n = 303) were recruited from the Rotman Research Institute (RRI) registry. There were four age groups: 50–59, 60–69, 70–79, and 80–89 years. Exclusion criteria were history of neurological disease, drug abuse, head injury with loss of consciousness, attention deficit hyperactivity disorder, active psychiatric illness, or use of medication containing any opioid. Non-native English speakers were included if they could understand all instructions. For test items, and administration and scoring instructions, see the Toronto Dementia Research Alliance website (www.tdra.ca). Figure 1 shows a flow chart of the participants analyzed in the normative study.
To assess test stability, the TorCA was readministered to 29 participants after a median interval of 73 days (range 28–120) with mean difference, percentage score change, and stability coefficients (Pearson r) calculated between the first and second tests. Internal consistency was determined by calculating Cronbach’s α for domain and Sum Index scores from the normative data study.
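The reliability statistics named above can be sketched in a few lines. This is a minimal illustration using the standard textbook formulas, not the authors' analysis code; the function names and the toy scores in the usage note are hypothetical.

```python
from statistics import mean, variance


def pearson_r(first, second):
    """Stability coefficient: Pearson correlation between two administrations."""
    m1, m2 = mean(first), mean(second)
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, second))
    ss1 = sum((a - m1) ** 2 for a in first)
    ss2 = sum((b - m2) ** 2 for b in second)
    return cov / (ss1 * ss2) ** 0.5


def percent_change(first, second):
    """Percentage change in the mean score from first to second administration."""
    return 100 * (mean(second) - mean(first)) / mean(first)


def cronbach_alpha(items):
    """Internal consistency of an index.

    items: one list of scores per subtest, all in the same participant order.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # index score per participant
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))
```

For example, `pearson_r(test1, test2)` on the paired scores of retested participants gives the stability coefficient, and `cronbach_alpha([subtest_a, subtest_b])` gives the internal consistency of a two-subtest index. An index built from only two dissimilar subtests will tend to produce a low alpha under this formula.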
Validation in aMCI
Participants over age 60 years, with differential diagnosis of normal cognition vs MCI, were referred from academic memory clinics across Toronto and London, Ontario, for clinical neuropsychological assessment. Although differential diagnosis at referral may not have added the descriptor “amnestic” to MCI, the final study sample comprised only participants with aMCI or normal cognition (NC) based on neuropsychological assessment. From 220 consecutive referrals from all sites, 25 refused clinical services, 7 were inappropriate, and 188 were assessed by a neuropsychologist. Of those assessed, 108 did not have MCI or NC and four met exclusion criteria, yielding 50 participants with aMCI (single domain/multiple domain = 13/37) and 26 with NC. Figure 2 shows a flow chart of the participants analyzed in the validation study.
As it proved difficult to find individuals with normal cognition in memory clinics, the remaining 31 normal participants were recruited from the current normative study sample and the community. The paper version of the TorCA was administered prior to neuropsychological assessment in all but three instances. The interval between neuropsychological assessment and TorCA was within six months.
As assessments were conducted in a clinical context, the neuropsychologists were aware of the TorCA scores and differential diagnoses. The majority of neuropsychological assessments were conducted by trained assistants not directly involved in the diagnostic process, although one of the neuropsychologists tested 42 participants. The TorCA was conducted by trained nurses, medical trainees, or research assistants who were blinded to the neuropsychological assessment results.
Exclusion criteria for the validation study were medical or neurological disorders that could cause cognitive deficits including untreated sleep apnea, traumatic brain injury with loss of consciousness greater than 30 min, history of stroke, attention deficit hyperactivity disorder requiring medication, substance abuse, or other significant psychiatric disorders.
The following were administered as part of the neuropsychological battery:
Kaplan–Baycrest Neurocognitive Assessment (KBNA).
Trail Making Test Forms A and B.
Wechsler Adult Intelligence Scale-III (WAIS-III) Digit Symbol.
WAIS-III Digit Span.
Wechsler Memory Scale-Revised (WMS-R) Logical Memory I and II subtests (Story A or B).
Wechsler Abbreviated Scale of Intelligence (WASI) Vocabulary (split half), Similarities, and Matrix Reasoning subtests.
Boston Naming Test (split half).
Delis–Kaplan Executive Function System (D-KEFS) Color-Word Interference Test.
Multifactorial Metamemory Questionnaire (Memory Mistakes scale).
Lawton and Brody ADL questionnaire.
Hospital Anxiety and Depression Scale.
All participants with aMCI met published criteria. Objective memory impairment was defined as deficits on three of four memory tests relative to expectations based on age, education, and intellectual status. Memory tests were WMS-R Logical Memory, KBNA Word List, KBNA Complex Figure, and WAIS-III Digit Symbol incidental recall. Deficit was defined as 1.5 standard deviations below estimated IQ based on the two-subtest IQ estimate of the WASI. Memory deficits had to occur at encoding or retention stages. Isolated retrieval deficits were not sufficient for diagnosis of aMCI.
Concurrent validity was determined by the ability of the TorCA to discriminate between aMCI and NC participants. Construct validity was determined by correlations between TorCA subtests and neuropsychological tests in the aMCI and NC groups and by testing for expected group differences on TorCA indices and subtests.
Equivalency of paper vs electronic version
Forty-five normal participants were tested using paper and iPad versions and were divided into two groups. One group (n = 22, female/male = 17/5; mean (SD) age = 73.6 (7.7) years) was recruited from the normative sample and was administered the paper version first (test–retest interval M = 792.3 days, SD = 262.9). The second group (n = 23, female/male = 18/5; mean (SD) age = 70.6 (10.1) years) was recruited from the RRI registry and was administered the iPad version first (test–retest interval M = 257.2 days, SD = 67.2).
Table 2 presents participant profiles and normative data. Groups did not differ in years of education. There were significantly more females for the 50–59 year group (χ2(df = 1) = 4.26, p = 0.04), 60–69 year group (χ2(df = 1) = 14.14, p = 0.001), and 70–79 year group (χ2(df = 1) = 16.33, p = 0.001) but not for the 80–89 year group (χ2(df = 1) = 1.08, p = 0.30).
Normative TorCA test scores are categorized into ≤ 5th percentile (impaired), 6th–24th percentile (borderline), or ≥ 25th percentile (normal). Median time to complete the TorCA was 34 min (range 25–63). Tables 3, 4, 5, and 6 present normative data for individual subtests.
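The three percentile bands can be written as a small lookup. This is a sketch only: the function name is hypothetical, and it assumes that a fractional percentile between the 5th and 6th is treated as borderline, a case the text does not address.

```python
def percentile_band(percentile):
    """Map a normative percentile (0-100) to the band used for TorCA scores."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile <= 5:
        return "impaired"    # at or below the 5th percentile
    if percentile < 25:
        return "borderline"  # 6th to 24th percentile
    return "normal"          # 25th percentile and above
```

The percentile itself would come from the age-group norms in Table 2 and Tables 3 to 6; that lookup is not shown here.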
The Sum Index was significantly affected by age (F(3,299) = 6.45, p = 0.001) (Table 2). There was a significant but small effect size (Cohen’s d = 0.31) for gender. Women scored a mean of 6.1 (SED = 2.2) points higher than men (F(1,301) = 7.36, p = 0.007). Age and education were weakly, but significantly, correlated with Sum Index (r = 0.24 and 0.23, both p < 0.001), each accounting for approximately 5% of the variance.
The results of the test–retest study using the paper version in normal participants are presented in Table 7. The scores remained remarkably stable across the retest intervals. Only the Memory—Immediate Recall (MIR), Memory—Delayed Recall (MDR), and Sum Index scores demonstrated significant increases and the increase in the latter was due to increase in the MIR and MDR indices. This indicates that there was a practice effect on the memory tests. Stability coefficients ranged from low (Orientation and Memory—Delayed Recognition, Visuospatial, and Working Memory/Attention/Executive Control Indices) to very good (Sum Index). The poor stability coefficients of Orientation and Memory—Delayed Recognition, Visuospatial, and Working Memory/Attention/Executive Control in large part are due to a restricted range of scores.
The intratest reliabilities of the TorCA indices are presented in Table 8. Reliability estimates ranged from low to good. The low coefficients of Orientation, Memory—Delayed Recognition, and Visuospatial Indices again are attributable to the restricted range of scores noted earlier. The Delayed Recall Index reliability coefficient was calculated by comparing the results of the Memory—Delayed Verbal Recall and the Memory—Delayed Visual Recall subtests and therefore did not represent a homogeneous construct. The Visuospatial Index reliability coefficient was calculated by comparing the results of the Benson Figure Copy and Clock Drawing subtests. Although both Benson Figure Copy and Clock Drawing measure visuospatial function, Clock Drawing is also a measure of planning, monitoring, and abstraction. Thus, these subtests are not homogeneous. Likewise, the Working Memory/Attention/Executive Control Index is not homogeneous in construct as it consists of measures of attention, working memory, conceptualization, and reasoning.
Validation in aMCI
Table 9 presents demographic features of the aMCI and NC groups. The groups did not differ in mean age, education, or Full-Scale IQ. The NC group had a higher proportion of females (67%) to males (33%) (χ2 = 6.33, p < 0.02), whereas the aMCI group had an approximately equal gender balance (54% male; 46% female).
Effect sizes based on difference between group means and standard deviations for neuropsychological tests used to determine group membership are provided in Fig. 3. There were significant effect sizes on verbal and visual learning (immediate recall of KBNA Word List and Complex Figure, WMS-R Logical Memory I), episodic memory (delayed recall and recognition of KBNA Word List and Complex Figure, WMS-R Logical Memory II), visual spatial working memory (KBNA Spatial Location), auditory working memory (WAIS-III Digit Span), attentional control (D-KEFS Color-Word Switching), visuospatial function (combined score for KBNA Complex Figure copy and Clock Drawing), semantic fluency (combined KBNA animal naming and first names), and cognitive flexibility (combined KBNA Practical Problem Solving and Conceptual Shifting). Overall, the aMCI group scored lower on neuropsychological testing but the largest effect sizes, in excess of 1.5 SD, were obtained on learning and episodic memory, thereby substantiating group classification as aMCI.
Table 9 presents between-group differences on TorCA indices. The aMCI group achieved a significantly lower TorCA Sum Index than did the NC group (F(1,105) = 36.86, p < 0.001). A MANOVA on the remaining seven domain indices revealed a significant effect for group (Wilks’ λ = 0.37, F(1,99) = 23.78, p < 0.001). Pairwise comparisons, with Bonferroni correction for seven multiple comparisons at p ≤ 0.05/7 (0.007), revealed significant differences for orientation, immediate memory recall, delayed memory recall, and delayed memory recognition indices.
Prior to analyzing TorCA subtest scores for group differences, boxplots for each subtest were inspected. Distribution of scores on Trail Making (completed trials measure, total correct minus incorrect lines), Alternating Sequences, Similarities, Sentence Repetition and Comprehension, Single Word Reading and Comprehension, and Semantic Knowledge showed a marked negative skew with a ceiling effect for both groups. Kolmogorov–Smirnov tests on these subtests revealed no differences in distribution of scores between the two groups. Therefore, these subtests were dropped from further between-group analyses.
Scores on Verbal Learning, Verbal Recall, Verbal Recognition, Visual Recall, Serial Subtractions, Digit Span, Trail Making A and B completed times measure, Benson Figure Copy, Clock Drawing, Verbal Fluency—F Words, Verbal Fluency—Animals, and MINT Naming were analyzed with a MANOVA for between-group differences (Table 10). There was a significant group effect (Wilks’ λ = 0.36, F(13,93), p < 0.001). Table 10 presents effect sizes for pairwise between-group comparisons for subtest scores. Large effect sizes, all in excess of 1.0, were obtained on memory tests including Verbal Learning, Delayed Verbal Recall, Delayed Verbal Recognition, and Delayed Visual Recall. There were moderate effect sizes on Trail Making B and Verbal Fluency—Animals. No significant between-group effects were found for Serial Subtractions, Trail Making A, Benson Figure Copy, Clock Drawing, Digit Span, Verbal Fluency—F Words, and MINT naming.
Concurrent validity with referenced neuropsychological tests
The TorCA Sum Index discriminated between the aMCI and NC groups (χ2 = 31.5, p < 0.0001, AUC = 0.84 (95% CI 0.75–0.92)). The sensitivity, specificity, likelihood ratio of a positive response (LRPR), likelihood ratio of a negative response (LRNR), positive predictive value (PPV), negative predictive value (NPV), Youden index, and correct classification were calculated for each Sum Index value from 209 to 319. The optimum cutoff value was determined by considering the maximum correct classification, LRPR, and Youden index combined with a view to minimizing false positives and maximizing classification accuracy. A Sum Index cutoff value of 275 was optimal and yielded an overall classification accuracy of 79% (95% CI 70–86%), sensitivity of 80% (95% CI 66–89%), specificity of 79% (95% CI 66–88%), LRPR of 3.80 (95% CI 2.26–6.40), and LRNR of 0.25 (95% CI 0.14–0.45). Given the aMCI prevalence of 47% in our sample, the 275 cutoff value yielded a PPV of 0.77 (95% CI 0.63–0.87) and NPV of 0.82 (95% CI 0.69–0.90). Agreement between the TorCA, using this cutoff value, and classification achieved by standard clinical and neuropsychological criteria was weak to moderate (κ = 0.58 (95% CI 0.4–0.74)).
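As an illustration of how these indices relate to one another, the sketch below derives them from a single 2×2 table. The cell counts are an assumption: they are not reported in the text, but 40/50 aMCI and 45/57 NC participants correctly classified are the integer counts consistent with the reported 80% sensitivity and 79% specificity. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Summarise a 2x2 table of test result against reference diagnosis."""
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    # Agreement expected by chance, needed for Cohen's kappa
    chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "lr_pos": sensitivity / (1 - specificity),  # LRPR
        "lr_neg": (1 - sensitivity) / specificity,  # LRNR
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "youden": sensitivity + specificity - 1,
        "kappa": (accuracy - chance) / (1 - chance),
    }


# ASSUMED counts at the Sum Index cutoff of 275 (inferred from the reported
# percentages with 50 aMCI and 57 NC participants; not given in the text).
metrics = diagnostic_metrics(tp=40, fn=10, fp=12, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these assumed counts, the accuracy, likelihood ratios, and predictive values agree with the reported figures to two decimal places, and kappa comes out close to the reported value. The confidence intervals are not reproduced here.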
To explore which TorCA indices best discriminated between aMCI and NC, indices for Orientation, Immediate Memory Recall, Delayed Memory Recall, Delayed Memory Recognition, Visuospatial, Working Memory/Attention/Executive Control, and Language were entered into a backward, stepwise logistic regression that generates a posttest probability of aMCI (Table 11). Four indices (Immediate Memory Recall, Delayed Memory Recall, Visuospatial, and Working Memory/Attention/Executive Control) correctly classified 92% (95% CI 86–97%) of the aMCI and NC groups (AUC = 97% (95% CI 94–99%)). Optimal discrimination was obtained for aMCI probability of 0.55, yielding sensitivity of 92% (95% CI 85–99%), specificity of 91% (95% CI 84–99%), PPV of 0.90 (95% CI 0.82–0.98), and NPV of 0.93 (95% CI 0.86–0.99). This corresponds to LRPR of 10.49 (95% CI 4.52–23.52) and LRNR of 0.09 (95% CI 0.03–0.23); both LRPR and LRNR values can yield large changes in posttest disease likelihood and thereby increase test accuracy [22, 23]. The indices in the logistic regression formula yielded strong agreement with clinical and neuropsychological classification for aMCI (κ = 0.83 (95% CI 0.74–0.92), χ2 = 74.0, p < 0.0001).
The neuropsychological tests were grouped into nine domains: Immediate Recall, Delayed Recall, Delayed Recognition, Visuospatial, Cognitive Flexibility, Attention/Concentration, Executive Control, Verbal Fluency, and Language. Correlations between TorCA and neuropsychological domains are presented in Table 12. The largest correlations were obtained between the three TorCA memory domains and the three neuropsychological test domains relating to memory. Small to medium-sized effects were found between the TorCA memory domains and neuropsychological test domains of Cognitive Flexibility, Attention/Concentration, and Language. Large effect sizes were obtained between the TorCA Working Memory/Attention/Executive Control domain and the neuropsychological Working Memory/Attention/Executive Control, Verbal Fluency, and Language domains. Medium effect sizes were noted with the Cognitive Flexibility and Attention/Concentration domains. The TorCA Working Memory/Attention/Executive Control domain was weakly associated with only the Immediate Recall domain. The Language domain was strongly associated with neuropsychological Language and Verbal Fluency domains, moderately associated with the Attention/Concentration and Working Memory/Attention/Executive Control domains, and weakly associated with all three memory domains and Cognitive Flexibility. The TorCA Visuospatial domain showed a weak but significant correlation with the neuropsychological Visuospatial domain but no significant correlation with any other neuropsychological domain.
Equivalency of paper and iPad versions
There was a strong correlation between paper and iPad versions (r(43) = 0.86, p < 0.001) and no difference between TorCA Sum Index on paper (M = 299.9, SD = 18.1) and iPad (M = 300.7, SD = 18.4) versions (t(44) = − 0.56, p = 0.58). There was a trend (t(44) = 2.00, p = 0.052) for the mean Sum Index to be slightly lower on the first administration (M = 298.9, SD = 18.6) compared to the second (M = 301.7, SD = 17.9). Test–retest reliability between first and second administration was good (r(43) = 0.87, p < 0.001). There was no association between test–retest interval and change in Sum Index on first and second testing (r(43) = 0.04, p = 0.77). In addition to lack of a linear relationship between the change in Sum Index and test–retest interval, neither a quadratic (p = 0.90) nor a logarithmic model (p = 0.66) fit the data. In addition, the mean TorCA Sum Index did not differ for the group that took the paper version first (M = 294.7, SD = 16.8) compared to the group that took the iPad version first (M = 303.0, SD = 19.6) (t(43) = 1.5, p = 0.14).
The TorCA was administered to 303 healthy volunteers between ages 50 and 89 years, yielding a relatively brief assessment of multiple cognitive domains with median administration time of 34 min. Test–retest results remained relatively stable over a median of 73 days (range 28–120) with mean increase of only 3.3 points. Age and education accounted for only 5% of the variance in total score. Although age-adjusted norms are available for each decade from 50 to 89 years, the TorCA can be administered across this range with minimal need for age correction. Paper and iPad version scores were not significantly different. The iPad version provides easier administration with near automation of scoring and graphical representation of percentile scores (Fig. 4).
Overall stability was good with only modest increase in the Sum Index on retesting. Stability coefficients were low for Orientation, Delayed Recognition, Visuospatial Function, and Working Memory/Attention/Executive Control due to the restricted range of scores. Nevertheless, these scores demonstrated a very small percentage change in scores. The change in the Sum Index (1.1%) reflected increases in the immediate and delayed memory indices (14.3% and 10.7% respectively) with no other index exceeding an increase of 1.5% (Language).
Internal consistency of the Sum Index was adequate and reflected the heterogeneous nature of individual tests. Low internal consistency reflected the diverse nature of cognitive abilities on Delayed Recall and Working Memory/Attention/Executive Control. The former combines verbal and visual memory, whereas the latter combines heterogeneous measures related to frontal system function. Low internal consistency also reflected restricted range in scores on Orientation, Delayed Recognition, and Visuospatial Function.
We validated the TorCA for detection of aMCI based on a need for cognitive assessment tools that can identify early decline, that are much shorter than typical neuropsychological batteries, and that can be administered by any health professional or trained assistant. A combination of TorCA subscores yielded correct classification, sensitivity, and specificity of over 90%. Logistic regression revealed that scores in four domains—Immediate Recall, Delayed Verbal and Visual Recall, Visuospatial Function, and Working Memory/Attention/Executive Control—correctly classified 92% of participants, and yielded an easily applied formula to calculate the probability of aMCI (www.tdra.ca). This is automatically calculated with the iPad version of the TorCA. It should be emphasized that the correct classification of 92% arises from four domains of the TorCA rather than the total score on the entire test. In contrast, correct classification was 79% based on the Sum Index (total score).
Although the logistic regression probability of 0.55 for aMCI is the optimal cutoff value, this may not always represent the best decision value for determining positive or negative cases. If sensitivity and specificity are held constant, PPV decreases as pretest disease probability (prevalence) decreases and increases as pretest probability increases. Conversely, NPV increases with decrease in pretest probability and decreases as pretest probability increases. PPVs and NPVs listed earlier for the optimal value relate only to the pretest probability of aMCI in our sample (50/107 = 0.47). Table 13 presents the range of PPV and NPV values for a cutoff value of 0.55 for pretest probabilities ranging from 0.05 to 0.90. PPVs and NPVs for a cutoff value of 0.90 are also provided. If a logistic regression value of 0.55 or higher is obtained for individuals with pretest probability of 0.20, then 72% will be correctly classified as aMCI. However, 28% will be misclassified, which is unacceptable. At the same level of pretest probability, a logistic regression value less than 0.55 results in correctly ruling out aMCI in 98% of negative cases. At a pretest probability of 0.20, raising the “rule-in” predicted value to 0.90 results in 88% of positive cases being true aMCI with only 12% false positives. A level of 0.20 was chosen in these examples because this is approximately the estimated prevalence of aMCI in community samples.
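The dependence of PPV and NPV on pretest probability follows directly from Bayes' rule. The sketch below is a minimal illustration, not the method used to build Table 13: it takes the rounded sensitivity (0.92) and specificity (0.91) reported for the 0.55 cutoff as inputs, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, pretest):
    """Return (PPV, NPV) for a given pretest probability via Bayes' rule."""
    true_pos = sensitivity * pretest
    false_pos = (1 - specificity) * (1 - pretest)
    true_neg = specificity * (1 - pretest)
    false_neg = (1 - sensitivity) * pretest
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Rounded values reported for the logistic regression cutoff of 0.55
SENSITIVITY, SPECIFICITY = 0.92, 0.91

for pretest in (0.05, 0.20, 0.47, 0.90):
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, pretest)
    print(f"pretest {pretest:.2f}: PPV {ppv:.2f}, NPV {npv:.2f}")
```

At a pretest probability of 0.20 this gives a PPV of about 0.72 and an NPV of about 0.98, matching the figures quoted above. At the sample prevalence of 0.47 it gives about 0.90 and 0.93. The values for the 0.90 cutoff cannot be derived this way because the sensitivity and specificity at that cutoff are not stated in the text.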
Based on the validation data for TorCA Sum Index reported in this article, the TorCA is comparable to published data on the MoCA for detection of MCI. A meta-analysis of 20 studies conducted by Ciesielska et al. reported that a MoCA cutoff value of 25/30 yielded a sensitivity of 80% and specificity of 81%. A meta-analysis of nine studies evaluating the MoCA’s ability to discriminate aMCI from normal controls found that a cutoff value of 23/30 yielded a correct classification of 86% (95% CI 83–90%) with a sensitivity of 83% (95% CI 76–89%) and specificity of 88% (95% CI 84–92%), while the original cutoff value of 26/30, as suggested by Nasreddine et al., yielded correct classification of only 78% (95% CI 75–82%) with sensitivity of 94% (95% CI 91–97%) and specificity of 66% (95% CI 60–71%). This compares to correct classification of 79% for the TorCA with a sensitivity and specificity of 80% and 79% using the Sum Index. The TorCA is also comparable to the Addenbrooke’s Cognitive Examination (ACE-R and ACE III) based on published data [27, 28]. Ahmed et al. reported that the ACE-R correctly classified 74% (95% CI 56–87%) of MCI and normal controls with a sensitivity of 90% (95% CI 58–98%) and specificity of 67% (95% CI 41–84%). Matias-Guiu et al. reported that the ACE-III correctly classified 75% (95% CI 66–82%) of MCI and normal controls with a sensitivity of 77% (95% CI 62–87%) and specificity of 75% (95% CI 62–83%). Although confidence intervals were not provided in the reports by Ahmed et al. and Matias-Guiu et al. [27, 28], we calculated them for comparison to our data.
The TorCA has potential resource allocation implications in centers with neuropsychology resources by identifying patients who do not require neuropsychological assessment due to a high probability of aMCI or because this disorder is effectively ruled out. Although the logistic regression was exploratory, a reasonable strategy might be to rule out aMCI if probability, based on the logistic regression formula, is below 0.55. Due to the likelihood that the logistic regression formula overestimates classification, we recommend a value of 0.90 or higher to rule in aMCI. For values between 0.55 and 0.90, referral should preferably be made for neuropsychological assessment to confirm diagnosis. In the absence of available neuropsychology resources, these patients should be followed to establish diagnosis.
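The suggested strategy amounts to a three-way decision on the predicted probability. The sketch below is illustrative only: the function name, parameter names, and return strings are hypothetical, and the probability is assumed to come from the logistic regression formula, which is not reproduced here.

```python
def triage(probability, rule_out_below=0.55, rule_in_at=0.90):
    """Apply the suggested rule-out / rule-in thresholds to a predicted
    probability of aMCI (a value between 0 and 1)."""
    if not 0 <= probability <= 1:
        raise ValueError("probability must be between 0 and 1")
    if probability < rule_out_below:
        return "aMCI ruled out"
    if probability >= rule_in_at:
        return "aMCI ruled in"
    # Intermediate values: confirm by neuropsychological assessment,
    # or follow the patient if that resource is unavailable.
    return "refer for neuropsychological assessment"
```

For example, `triage(0.72)` falls in the intermediate band and returns the referral outcome. Because the formula is exploratory and awaits cross-validation, the thresholds are exposed as parameters rather than fixed.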
Study limitations should be acknowledged. First is the need for cross-validation. Whereas the validation study revealed that the use of the logistic regression formula would refine the identification of aMCI, this represents an initial, exploratory result and further cross-validation of the formula is needed to confirm critical values and stability of constituent indices. A second limitation is that the logistic regression formula for probability of aMCI applies only to differential diagnosis of aMCI vs normal aging. Future studies are needed to validate the TorCA for differentiating aMCI from other cognitive disorders, and to determine whether it performs equally well for identifying single vs multiple domain aMCI. A third limitation is that participants in the validation study had relatively high IQs. Studies are needed to determine validity of the TorCA for diagnosing aMCI in participants with lower IQs. In addition, a caution is that interpretation of positive or negative cases must take into account differences between patients’ estimated pretest probabilities of a condition and prevalence of the condition in validation studies. A fourth limitation is that the orientation items consisting of Prime Minister, Premier, and season are country specific. This will be addressed in future by translating the TorCA into languages other than English and carrying out normative and validation studies using the translated tests. Ideally, normative and validation studies should also be carried out in English-speaking countries other than Canada. Finally, this study focused only on aMCI from a diagnostic perspective. Future studies will be needed to validate the TorCA for diagnosis of other forms of mild cognitive decline. It is likely that the discriminating indices on the TorCA will differ from those that predict aMCI.
The TorCA is a relatively short cognitive assessment tool for identification of early cognitive decline and can be administered by any health care professional or assistant with appropriate training. It also has the potential to save both time and physical resources by identifying patients who may not require neuropsychological assessments for diagnosing aMCI. Future studies will focus on cross-validating the TorCA for aMCI and validating this test for disorders other than aMCI.
Abbreviations
aMCI: Amnestic mild cognitive impairment
KBNA: Kaplan–Baycrest Neurocognitive Assessment
LRNR: Likelihood ratio of a negative response
LRPR: Likelihood ratio of a positive response
MINT: Multilingual Naming Test
MoCA: Montreal Cognitive Assessment
NPV: Negative predictive value
PPV: Positive predictive value
RRI: Rotman Research Institute
TorCA: Toronto Cognitive Assessment
WAIS-III: Wechsler Adult Intelligence Scale-III
WASI: Wechsler Abbreviated Scale of Intelligence
WMS-R: Wechsler Memory Scale-Revised
The authors would like to acknowledge S. Wilson for comprehension and semantic knowledge tasks, T. Gollan for the abbreviated naming task (MINT), A. Hillis for the repetition task, K. Possin and J. Kramer for the Benson Figure, the Frontotemporal Dementia (FTLD) Workgroup (Chair, D. Knopman) of the National Alzheimer’s Coordinating Center (NACC) (Director, W.A. Kukull, Grant Number U01 AG016976) for use of the repetition task and the Benson Figure from the NACC FTLD Module, and G.G. Fillenbaum for CERAD use and materials.
This work was supported in part by the Toronto Dementia Research Alliance Partner Institutions (Baycrest Health Sciences, Centre for Addiction and Mental Health, St. Michael’s Hospital, Sunnybrook Health Sciences Centre, University Health Network, and Faculty of Medicine, University of Toronto); Department of Medicine Alternative Funding Plan and Division of Neurology Innovation Fund, University of Toronto; Edwards Family Foundation; and Ontario Neurodegenerative Disease Research Initiative (ONDRI) funded by Ontario Brain Institute (OBI). MF receives support from the Saul A. Silverman Family Foundation as a Canada International Scientific Exchange Program and Morris Kerzner Memorial Fund. SEB receives support from the Brill Chair in Neurology, University of Toronto and Sunnybrook Foundation, the Hurvitz Brain Sciences Research Program, Sunnybrook Research Institute, and the Department of Medicine, Sunnybrook Health Sciences Centre. SD holds the Dalhousie Medical Research Foundation Irene MacDonald Sobey Endowed Chair in Curative Approaches to Alzheimer’s Disease. GN receives support from the George, Margaret and Gary Hunt Family Chair in Geriatric Medicine, University of Toronto. SCS, RS, and TG received partial grant support from CIHR MOP 201403, the Ontario Brain Institute, and Brain Canada. The funding sources had no role in the study design, in the collection, analysis and interpretation of data, in writing of the report, and in the decision to submit data for publication.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
Tom Gee is now at Indoc Research, Toronto, ON, Canada. Barry D. Greenberg is now at Department of Neurology, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
Ethics approval and consent to participate
The study was approved by the Research Ethics Board at Baycrest Health Sciences. Written informed consent was obtained from all participants.
MF received financial support for a Behavioural Neurology fellow from Eli Lilly Canada, served on an advisory board for Eli Lilly Canada, receives royalties for a book on Clock Drawing from Oxford University Press, is listed on a provisional patent related to methods and kits for differential diagnosis of Alzheimer’s disease vs frontotemporal dementia using blood biomarkers, and may be listed on the planned patent application, and serves on the editorial board of Brain and Cognition. LL receives royalties from Pearson Assessment on sales of the Kaplan Baycrest Neurocognitive Assessment (KBNA). NH received research support from Axovant, Lundbeck and Roche, and consultation fees from Merck, Lilly, Mediti and Astellas. SEB reports institutional grants from Pfizer, GE Healthcare, Eli Lilly, Roche, Cognoptix, Biogen, and Novartis and personal honoraria from Pfizer, Eli Lilly, Boehringer Ingelheim, Novartis, Merck, and Medscape (Biogen Idec); SEB also reports salary support from Sunnybrook Research Institute, Brill Chair, Department of Medicine, Sunnybrook Health Sciences Centre. SCS is Chief Science Officer of ADMdx, LLC. MCT, KAS, YG, RS, NN, TG, MOA, MB, SD, AF, CEF, JF, BDG, MG, RK, JK, SK, BL, SL, MPM, GN, RP, TKR, WR, MUW, NPLGV, JLW, and DFT-W do not have any competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Freedman, M., Leach, L., Carmela Tartaglia, M. et al. The Toronto Cognitive Assessment (TorCA): normative data and validation to detect amnestic mild cognitive impairment. Alz Res Therapy 10, 65 (2018). https://doi.org/10.1186/s13195-018-0382-y
Keywords
- Toronto Cognitive Assessment
- Mild cognitive impairment
- Cognitive assessment
- Normative study