Effects of pharmacological and nonpharmacological treatments on brain functional magnetic resonance imaging in Alzheimer’s disease and mild cognitive impairment: a critical review

Background A growing number of pharmacological and nonpharmacological trials have been performed to test the efficacy of approved or experimental treatments in Alzheimer’s disease (AD) and mild cognitive impairment (MCI). In this context, functional magnetic resonance imaging (fMRI) may be a good candidate to detect brain changes after a short period of treatment.
Main body This critical review aimed to identify and discuss the available studies that have tested the efficacy of pharmacological and nonpharmacological treatments in AD and MCI cases using task-based or resting-state fMRI measures as primary outcomes. A PubMed-based literature search was performed using three macro-areas: ‘disease’, ‘type of MRI’, and ‘type of treatment’. Each contribution was individually reviewed according to the Cochrane Collaboration’s tool for assessing risk of bias. Study limitations were systematically detected and critically discussed. We selected 34 pharmacological and 13 nonpharmacological articles. According to the Cochrane Collaboration’s tool for assessing risk of bias, 40% of these studies were randomized but only a few clearly described the randomization procedure, 36% declared the blindness of participants and personnel, and only 21% reported the blindness of outcome assessment. In addition, 28% of the studies presented more than 20% drop-outs at short- and/or long-term assessments. Additional common shortcomings of the reviewed works were related to study design, patient selection, sample size, choice of outcome measures, management of drop-out cases, and fMRI methods.
Conclusion There is an urgent need to obtain efficient treatments for AD and MCI. fMRI is powerful enough to detect even subtle changes over a short period of treatment; however, the soundness of methods should be improved to enable meaningful data interpretation.


Background
Alzheimer's disease (AD) is a devastating neurodegenerative disease and the most prevalent form of dementia [1]. There is an urgent need to identify effective treatments that may improve cognitive function in subjects with manifest or prodromal AD, and in people at risk of developing the disease, such as those with mild cognitive impairment (MCI). Currently there are two classes of drugs approved for the treatment of AD: the cholinesterase inhibitors, which are licensed for the treatment of mild-to-moderate AD, and memantine for moderate-to-severe disease stages [2]. These treatments have been shown to slow the course of the disease, but they can neither modify progression nor prevent onset [2]. Although no new therapeutics have been approved for AD in over 10 years, a substantial number of compounds thought to reduce amyloid and/or tau deposition are currently being tested [2]. The growing social emergency represented by AD and the lack of medical treatments able to modify the disease course have kindled interest in nonpharmacological therapies, such as cognitive stimulation, aerobic physical exercise, music therapy, and diet, with the aim of optimizing cognitive and functional skills and improving patient quality of life [3].
Numerous clinical trials have been performed to explore the efficacy of pharmacological and nonpharmacological treatments on cognitive and/or behavioral symptoms in AD and MCI patients. In clinical trials, outcome measures are typically performance-based instruments or structured surveys of clinician/caregiver impression of change [4]. Although the efficacy of treatments for AD and MCI must ultimately be demonstrated using clinically meaningful outcome measures, such trials will likely require hundreds of patients studied for medium-term periods [5]. Thus, surrogate markers of efficacy with less variability than clinical assessments are needed to reduce the number of subjects. These markers may also be particularly valuable in the early phase of drug development to detect a preliminary "signal of efficacy" over a shorter time period.
Given the growing body of evidence that alterations in synaptic function are present very early in the course of the neurodegenerative disease process [6,7], functional magnetic resonance imaging (fMRI) has been shown to be particularly useful for detecting early alterations in brain function and may be a critical marker for the detection of physiological changes over a short interval [8]. Specifically, fMRI may be valuable in evaluating acute and subacute effects of therapeutic interventions by showing how they modulate targeted circuits [9]. Using fMRI, the efficacy of treatments on brain function can be revealed by task-based or task-free (resting-state) approaches. By modeling cognitive paradigms, task-based fMRI explores cerebral functioning while the subject is performing specific activities that can mimic the actual difficulties occurring in daily life. A number of pioneering task-based fMRI studies have identified reduced activation in hippocampal and parahippocampal regions during episodic memory tasks in patients with AD [10][11][12][13] and, less consistently, both medial temporal lobe decreased and increased activation in patients with MCI [11,12][14][15][16][17][18]. In addition, resting-state fMRI has the potential to detect subtle functional abnormalities in brain networks supporting complex cognitive processes that are progressively impaired over the course of AD. At present, several studies of AD patients have demonstrated alterations of the default mode network (DMN) and other resting-state networks related to cognitive functions [19][20][21]. Compared to task-based approaches, resting-state imaging has the advantage of avoiding performance-related variability and is also less complicated to acquire and standardize [22].
The aim of this manuscript is to review studies that have tested pharmacological or nonpharmacological treatments in AD and MCI patients by using task-based or resting-state fMRI measures as primary outcomes. Furthermore, from a critical point of view, we explore the factors that could act as bias while verifying the efficacy of a treatment. Finally, we offer practical suggestions that could be useful in future studies.

Formal literature review research
A formal literature review was conducted on Medline in two separate sections, one for pharmacological and the other for nonpharmacological studies. In both cases, the search covered relevant articles (and their references) published in peer-reviewed journals before 20 March 2017 and was built on three macro-areas: 'disease', 'type of MRI', and 'type of treatment'. The disease was searched with the single term 'mild cognitive impairment' or 'MCI' in the title and abstract only, or with the MeSH term 'Alzheimer's disease' or the same single term in the title and abstract only. The type of MRI was searched with the single terms 'functional MRI' or 'fMRI' or 'functional connectivity'.

Pharmacological studies
The type of treatment was searched with the MeSH term 'Therapeutics' or the single terms 'treatment' or 'pharmacological treatment'. The final search line was the following:

Nonpharmacological studies
The type of treatment was searched with the MeSH terms 'Physical Therapy Modalities' or 'Exercise Therapy' or the single terms 'physical therapy' or 'motor rehabilitation' or 'physical training' or 'exercise training' or 'physical exercise' or 'cognitive exercise' or 'cognitive rehabilitation' or 'cognitive training' or 'cognitive stimulation'. The final search line was the following:
(((((("Alzheimer Disease"[Mesh]) OR alzheimer's disease[Title/Abstract]) OR MCI[Title/Abstract]) OR mild cognitive impairment[Title/Abstract])) AND (((functional mri) OR fmri) OR functional connectivity)) AND ((((((((((("Exercise Therapy"[Mesh]) OR "Physical Therapy Modalities"[Mesh]) OR physical exercise) OR exercise training) OR physical therapy) OR physical training) OR motor rehabilitation) OR cognitive exercise) OR cognitive rehabilitation) OR cognitive training) OR cognitive stimulation).
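The structure of the search line (three macro-areas joined by AND, with the terms inside each macro-area joined by OR) can be illustrated with a short Python sketch. The helper names `or_block` and `build_query` are hypothetical, and the flattened parenthesization is an assumption that is logically equivalent to, but not character-for-character identical with, the nested string reported above.

```python
# Sketch: assemble a PubMed-style Boolean query from three macro-areas.
# The helper names are illustrative; the terms are those listed in the text.

def or_block(terms):
    """Join the terms of one macro-area with OR inside one pair of parentheses."""
    return "(" + " OR ".join(terms) + ")"

def build_query(disease, imaging, treatment):
    """Join the three macro-areas with AND."""
    return " AND ".join(or_block(terms) for terms in (disease, imaging, treatment))

DISEASE = [
    '"Alzheimer Disease"[Mesh]',
    "alzheimer's disease[Title/Abstract]",
    "MCI[Title/Abstract]",
    "mild cognitive impairment[Title/Abstract]",
]
IMAGING = ["functional mri", "fmri", "functional connectivity"]
TREATMENT_NONPHARMACOLOGICAL = [
    '"Exercise Therapy"[Mesh]',
    '"Physical Therapy Modalities"[Mesh]',
    "physical exercise",
    "exercise training",
    "physical therapy",
    "physical training",
    "motor rehabilitation",
    "cognitive exercise",
    "cognitive rehabilitation",
    "cognitive training",
    "cognitive stimulation",
]

query = build_query(DISEASE, IMAGING, TREATMENT_NONPHARMACOLOGICAL)
print(query)
```

Keeping each macro-area as a separate list makes it straightforward to reuse the 'disease' and 'type of MRI' blocks with a different 'type of treatment' block.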

Critical review
Each original contribution was individually reviewed according to the Cochrane Collaboration's tool for assessing risk of bias [23]. This tool provides criteria for judging the risk of bias in experimental designs testing the efficacy of treatments [23]. Each selected article was independently judged by two reviewers (EC and ES) according to seven categories: 1) random sequence generation; 2) allocation concealment; 3) blinding of participants and personnel; 4) blinding of outcome assessment; 5) short-term incomplete outcome data; 6) long-term incomplete outcome data; and 7) selective reporting [23]. A judgment of 'low risk' of bias was assigned when bias was absent or considered unlikely to have altered the results, 'high risk' of bias when the potential for bias weakened confidence in the results, and 'unclear risk' when there was some doubt about the effect of bias on the results due to insufficient information. When the two reviewers did not reach agreement, the specific article was further discussed with a third reviewer (FA) for a final judgment. Further technical biases were identified by the reviewers according to their expertise in the fields of neuroimaging, neurology, neuropsychology, and physiotherapy and were discussed in dedicated sessions.
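The two-reviewer procedure can be sketched as a simple comparison over the seven categories. This is a minimal illustration only: the function name, the dictionary layout, and the inclusion of a 'not applicable' value (taken from the Fig. 2 legend) are assumptions, not a description of how the judgments were actually recorded.

```python
# Sketch: find the categories on which two reviewers disagree, i.e. those that
# would be referred to a third reviewer. Names and layout are hypothetical.

DOMAINS = (
    "random sequence generation",
    "allocation concealment",
    "blinding of participants and personnel",
    "blinding of outcome assessment",
    "short-term incomplete outcome data",
    "long-term incomplete outcome data",
    "selective reporting",
)
JUDGMENTS = {"low risk", "high risk", "unclear risk", "not applicable"}

def domains_needing_third_reviewer(first, second):
    """Return, in category order, the domains where the two judgments differ."""
    disputed = []
    for domain in DOMAINS:
        a, b = first[domain], second[domain]
        if a not in JUDGMENTS or b not in JUDGMENTS:
            raise ValueError(f"unknown judgment for {domain!r}: {a!r} / {b!r}")
        if a != b:
            disputed.append(domain)
    return disputed

# Invented example: the reviewers agree on everything except one category.
reviewer_1 = {domain: "low risk" for domain in DOMAINS}
reviewer_2 = dict(reviewer_1, **{"allocation concealment": "unclear risk"})
print(domains_needing_third_reviewer(reviewer_1, reviewer_2))
```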

Pharmacological studies
We obtained 1506 articles. Through title and/or abstract reading, we excluded review articles, articles that did not directly look at the treatment effect on fMRI measures, animal model studies, and articles written in languages other than English. We included 34 pharmacological studies (Fig. 1 and Table 1). Twelve studies were on MCI patients, 21 on AD patients (16 on mild AD, 4 on mild-to-moderate AD, 1 on moderate AD), and one included both mild AD and MCI cases. Twelve studies were randomized controlled trials while the others had a nonrandomized or an observational design.

Summary
As expected, the effect of acetyl-cholinesterase inhibitors (AchEI) has been investigated in the majority of studies (82%), followed by levetiracetam (6%), memantine (3%), caffeine (3%), and Chinese medicines such as Compound Congrongyizhi and Bushen capsules (6%). In general, treatments lasted from a day (acute) to 6 months. In only one study did the authors observe the effect of the proposed treatment over 24 months. The adopted fMRI approach was: task-based fMRI in 74% of studies, using memory (44%, such as encoding, retrieval, recognition and/or matching tasks), visual attention (3%), visuospatial or spatial navigation (6%), N-back (18%), or semantic association paradigms (3%); resting-state fMRI in 23% of studies; and both resting-state fMRI and visual encoding paradigms in the remaining 3%. fMRI studies showed positive effects of cognitive enhancing drugs on brain activation during cognitive task performance or the resting state in patients with AD and MCI. Both acute and prolonged exposure to pharmacological therapies were associated with fMRI changes in AD-specific and non-AD regions. In the majority of the studies, these changes paralleled improvements in fMRI task performance and in global cognition as measured by formal neuropsychological assessment outside the scanner. However, due to the heterogeneity of pharmacological treatment, dosage, and cognitive paradigms used for fMRI tasks, a generalization of the results is challenging.
In mild AD, a single dose (3 mg) of rivastigmine [24,25] or infusion of physostigmine [26,27] compared to placebo was associated with greater activation of the right precuneus and parahippocampal gyrus [26], bilateral fusiform cortex [25,27], and prefrontal areas [24] during face-recognition memory paradigms, which correlated with improved task performance [24,27]. Using a similar paradigm in mild-moderate AD, increased activation of the right fusiform gyrus was observed after 10 weeks of donepezil [28]. During a task assessing the auditory process of verbal memory, activity in mild AD patients increased in the left temporal cortex, parahippocampal gyrus, and frontoparietal executive network, together with an increase in successfully retrieved trials, after 6 weeks of donepezil [29,30]. During a face-recognition task, both increased activation after acute (8 mg) and decreased activation after prolonged (5 days) galantamine exposure were observed in parahippocampal regions in mild AD [31]. In mild AD patients, 3 months of treatment with galantamine reduced the fMRI signal within the dorsal pathway during a location-matching test [32]. Most studies that investigated the effect of prolonged treatment exposure showed that mild AD patients "normalized" the fMRI activity to the level of controls at baseline in AD-crucial regions after about 20 weeks of donepezil [33], rivastigmine [34], and galantamine [35] treatments, in parallel with improved global cognition and task performance [33,34]. However, not all studies found a correlation between fMRI changes and clinical improvement; for example, McGeown et al. demonstrated a widespread pattern of decreased fMRI activity during semantic association and working memory tasks after 20 weeks of donepezil, but higher accuracy in task performance was associated with increased recruitment in nontask-relevant regions [36]. Finally, fMRI changes were observed to be greater in AchEI "responders" [37].
In MCI patients, increased fMRI activity in hippocampus and parahippocampal regions was observed during a spatial navigation task after only 7 days of galantamine treatment [38] as well as during face encoding after 6 days of exposure to the same therapy [39]. A stabilization of fMRI hippocampal activity (decreased to the level of healthy controls) during a memory recognition task was found after 2 weeks at low doses of levetiracetam, with parallel improvement in patient memory performance [40,41]. During a face-recognition task, increased activation after acute (8 mg) and decreased activation after prolonged (5 days) galantamine exposure were observed in posterior cingulate cortex (PCC), superior parietal regions, and frontal cortex in MCI patients [31]. In MCI patients, better task performance, enhanced functional connectivity between the hippocampus and the fusiform face area during a face-recognition fMRI task [42], and enhanced connectivity between the hippocampus and frontal and striatal regions during a verbal episodic encoding task [43] were observed after 3 months of treatment with donepezil. Increased inferior frontal fMRI activity was observed during face retrieval after 3 to 6 months of the same treatment [44]. Using working memory and location-matching task paradigms in MCI patients, acute administration of caffeine [45], about 10 weeks of treatment with donepezil [46], 3 to 6 months of treatment with rivastigmine [47], and 3 months of exposure to Compound Congrongyizhi Capsule [48] enhanced the functional activity in the frontoparietal pathway, with improved patient accuracy during the tasks [46]. Several resting-state fMRI studies reported increased functional connectivity after pharmacological treatments in mild-to-moderate AD patients.
Increased connectivity was observed in the DMN [49], between the hippocampus and several cortical and subcortical regions [50], and between the PCC and prefrontal and parietal brain regions [51] after 3 months of donepezil, in parallel with an improvement in global cognitive scores [49][50][51]. In addition, increased resting-state connectivity was observed after 3 to 4 months of donepezil in non-DMN orbitofrontal [52] and dorsolateral prefrontal networks [53]. This effect was observed to be greater in apolipoprotein E ε4 carriers and to be present regardless of the kind of AchEI administered [54]. In mild-moderate AD, increased resting-state functional connectivity was also observed in the posterior and hippocampal DMN components after 12 months of galantamine [55] and in moderate-severe AD after 6 months of memantine [56]. Importantly, one study showed that 3 months of treatment with donepezil in mild AD cases was also associated with "restored"/stabilized hippocampal connectivity (i.e., decreased negative correlations) with cortical regions in the parietal, temporal, and frontal cortices [50]. In MCI patients, resting-state connectivity increased in the right precuneus within the DMN with parallel improvement in verbal and working memory after 24 months of treatment with Bushen capsules [57].
Fig. 2 Judgments of articles according to the seven categories of the Cochrane Collaboration's tool for assessing risk of bias. Positive marks denote low risk or no bias; negative marks denote high-risk bias; question marks denote unclear information. NA, not applicable.
Although several studies showed both clinical and fMRI changes after pharmacological therapies (Table 1), none of them directly compared clinical and fMRI effect sizes in order to define the most powerful marker to monitor treatment efficacy.

Critical review
According to the Cochrane Collaboration's tool for assessing risk of bias, 12 studies (35%) were randomized but only one clearly described the randomization procedure and the allocation. Twelve studies (35%) declared the blindness of participants and personnel; for two studies (6%) this information was unclear, and the other 20 reports (59%) were unblinded. Five studies (15%) declared the blindness of outcome assessment; for 12 studies (35%) this information was unclear, and the other 17 (50%) were unblinded. Eleven studies (32%) presented more than 20% drop-outs at short- and/or long-term assessments, leading to a 'high risk' of bias due to incomplete outcome data. All studies appropriately reported the primary and the secondary outcome measures of the investigation. A report of the final judgments for each selected article is shown in Fig. 2.

Nonpharmacological studies
We obtained 777 articles and excluded articles for the same reasons reported above for the pharmacological studies. Two further articles were manually identified through the reference lists of the selected manuscripts. We included 13 nonpharmacological studies (Fig. 3 and Table 2), with 10 studies on MCI patients (five on cognitive rehabilitation, three on physical rehabilitation, and two combined) and three on AD patients (two on cognitive rehabilitation and one on combined cognitive-physical training; one study involved mild AD and two involved mild-to-moderate AD). Seven studies were randomized controlled trials while the others had a nonrandomized or an observational design.

Summary
Studies on cognitive rehabilitation proposed different types of training such as verbal and visual encoding, retrieval and mnemonic association strategies, auditory-verbal discrimination, mindfulness, singing therapy, reality orientation exercises, and occupational/recreational therapy. Studies investigating the effects of physical therapy were based on aerobic and progressive resistance training. While physical training usually lasted about 3 months, the cognitive and combined approaches presented greater duration variability (from 2 weeks to 7 months). Overall, both MCI and AD patients benefited from cognitive training while only MCI patients seemed to benefit from physical therapy. The adopted fMRI approaches were resting-state fMRI (23%), or task-based fMRI (77%) using memory paradigms such as encoding, retrieval, association, and discrimination tasks (54%), visuo-spatial attention (8%), and verbal paradigms (15%). Due to the intensity of the programs and/or the difficulty of the proposed fMRI tasks, most of these studies focused on MCI rather than AD patients. A summary of findings is difficult due to the heterogeneity of training and task selection. However, it emerges that cognitive, physical, or combined training is mainly associated with enhanced brain activity or connectivity in trained patients with concomitant improvement in specific cognitive functions. The effects of cognitive rehabilitation have been assessed with fMRI tasks in the majority of studies. After 2 months of training on strategies for acquiring new information, mild AD patients showed an increased activity in the frontoparietal areas and insula during an unfamiliar face-name association task [58]. After 6 months of singing training, mild-moderate AD patients showed improved daily living activities, behavior, and reasoning, together with increased fMRI activation of the angular and lingual gyri during a karaoke task [59].
In MCI patients, after an intense program of encoding/retrieval memory training, increased recruitment of frontotemporal areas, basal ganglia, and cerebellum was observed during a memory-encoding task [60], and of frontal, parietal, temporal, and occipital areas [61] and left hippocampus [62] during memory-association tasks. During the memory retrieval phase, in trained MCI patients, a specific relationship between the increased activity of the right inferior parietal lobule and the improved performance on verbal delayed recall was found [60]. After a 2-month computer-based program on auditory verbal discrimination, MCI patients showed increased activity in the left hippocampus during an auditory verbal task with a parallel improvement in memory performance as tested outside the scanner [63]. MCI patients showed an increased resting-state functional connectivity between the PCC and bilateral medial prefrontal cortex and between the PCC and left hippocampus after eight sessions of mindfulness-based stress reduction [64].
The effects of aerobic training have been assessed in MCI patients with both task-based and resting-state fMRI. After 3 months of moderate aerobic exercises, no specific effects on brain activations were observed using a semantic memory task [65], while an increased resting-state functional connectivity between the PCC and bilateral frontoparietal and temporal cortices, insula and cerebellum was observed in MCI cases [66]. An increased activity of the left caudate after regular high-intensity physical activity compared to low-intensity training was observed using a famous-name discrimination paradigm [67].
The efficacy of a combined (cognitive and physical) approach was investigated in three studies, which adopted multidimensional stimulation programs. In the first study, mild-moderate AD patients were involved in 30 training sessions [68]. After training, during a verbal fluency task, AD patients showed an increased recruitment of the bilateral superior temporal gyrus, right insula, and thalamus associated with improvement in global cognition [68]. In a second study, after a 7-month training, 113 MCI patients showed no specific training-related brain changes during a visuospatial attention task [69]. Finally, one study investigated the effect of 26 weeks of progressive resistance training and computerized cognitive training (CCT) in 100 MCI patients using resting-state fMRI [70]. Both trainings, as well as the combination of the two, were associated with changes in functional connectivity between the hippocampus, PCC, and frontotemporal regions [70]. Of note, increased connectivity between the hippocampus and left superior frontal cortex after CCT was associated with improved memory performance [70].
No study directly compared clinical and fMRI effect sizes in order to define the most powerful marker to monitor treatment efficacy.

Critical review
According to the Cochrane Collaboration's tool for assessing risk of bias, seven studies (54%) were randomized but only four clearly described the randomization procedure and none the allocation. Five studies (38%) stated the blindness of participants and personnel; for two studies (15%) this information was unclear, and the other six (46%) were unblinded. Five studies (38%) reported the blindness of outcome assessment; for three studies (23%) this information was unclear, and the other five (38%) were unblinded. Two studies (15%) presented more than 20% drop-outs at short- and/or long-term assessments. All studies but one appropriately reported the primary and the secondary outcome measures of the investigation. A report of the final judgments for each selected article is shown in Fig. 2.
Common shortcomings of the reviewed works concerned study design, patient selection, sample size, choice of outcome measures, management of drop-out cases, and fMRI methods.
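As a worked check, the pooled percentages quoted in the abstract can be re-derived from the study counts reported in the two critical-review paragraphs (34 pharmacological and 13 nonpharmacological studies, 47 in total). The short Python sketch below uses only those reported counts; the function and variable names are illustrative.

```python
# Sketch: pool the reported counts across the 34 pharmacological and
# 13 nonpharmacological studies and express them as rounded percentages.

TOTAL_STUDIES = 34 + 13

# (pharmacological count, nonpharmacological count), as reported in the text
COUNTS = {
    "randomized": (12, 7),
    "blinded participants and personnel": (12, 5),
    "blinded outcome assessment": (5, 5),
    "more than 20% drop-outs": (11, 2),
}

def pooled_percentage(pharmacological, nonpharmacological, total=TOTAL_STUDIES):
    """Percentage of all reviewed studies, rounded to the nearest integer."""
    return round(100 * (pharmacological + nonpharmacological) / total)

pooled = {label: pooled_percentage(*pair) for label, pair in COUNTS.items()}
for label, value in pooled.items():
    print(f"{label}: {value}%")
```

The results (40%, 36%, 21%, and 28%) match the figures given in the abstract.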
In the following discussion, we underline the strengths and limitations of the reviewed studies and provide suggestions to overcome these issues.

Patient selection, randomization, and allocation
The definition of the clinical population is a critical point. Targets of the proposed treatments should be cases of prodromal or probable AD with a clinical diagnosis supported by biomarkers [71]. Over the last decades, the development of subject-selection strategies that maximize the power of treatment studies by identifying target populations has been an important focus of large international studies such as the Alzheimer's Disease Neuroimaging Initiative [71]. Abnormal tau and amyloid β42 cerebrospinal fluid levels, baseline MRI atrophy, and apolipoprotein E ε4 status have been used as successful stratification strategies [72] and should be applied to define an early clinical population, such as MCI, or at-risk asymptomatic subjects. However, only a few of the reviewed studies [24,26,27,29,30] used biomarkers in the inclusion process and, for some others, the clinical features of the MCI population (for instance, whether it was amnestic) were also unclear. When the study sample is selected without a clear clinical definition and without biomarkers, findings are underpowered and diluted.
In most of the reviewed articles, the randomization procedure was not performed due to the observational nature of the study design and to the absence of a placebo group or of active healthy controls. Although these studies observed an effect of the proposed treatments on the outcome measures, the authors cannot argue for a specific efficacy of the treatment itself since the effect could be due to the mere fact of receiving a clinical intervention. The absence of a control condition also leads to the unblinding of participants and personnel; this is an additional confounding factor that affects the soundness of methods. On the other hand, many studies that reported a randomized design failed to clearly describe the subject randomization and allocation procedure or introduced some a priori bias (such as a priori stratification of the sample by gender [68] or the choice of a disproportionate group ratio [47]) that may affect the neutral distribution of subjects across the experimental groups.
We have the following suggestions: 1) the population should be well-defined clinically and the AD diagnosis should be biomarker-supported; and 2) randomization and allocation must follow recognized guidelines and should be clearly reported in the study description.
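As an illustration of what a clearly reportable randomization procedure can look like, the following Python sketch generates a permuted-block allocation sequence. Permuted-block randomization is used here only as one common example; it is not the procedure of any reviewed study, and the function name, arm labels, block size, and seed are all assumptions. In practice the sequence would be generated and concealed by someone not involved in enrollment.

```python
# Sketch: permuted-block randomization with a fixed seed, so that the
# allocation sequence is reproducible and can be reported exactly.
import random

def permuted_block_sequence(n_participants, arms=("treatment", "control"),
                            block_size=4, seed=2017):
    """Return an allocation list in which every full block is balanced across arms."""
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    rng = random.Random(seed)  # dedicated generator, independent of global state
    per_arm = block_size // len(arms)
    sequence = []
    while len(sequence) < n_participants:
        block = list(arms) * per_arm
        rng.shuffle(block)  # random order within each block
        sequence.extend(block)
    return sequence[:n_participants]

allocation = permuted_block_sequence(12)
print(allocation)
```

Reporting the method, block size, and how the sequence was concealed is what allows a reader to judge the 'random sequence generation' and 'allocation concealment' categories as low risk.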

Type, intensity, and duration of treatment
The persistence of effects, along with generalization of gain in everyday life, is the critical point of pharmacological and nonpharmacological therapies. The need for long-term treatment to maintain positive effects raises the problem of treatment costs. It is noteworthy that the selection of the type, intensity, and duration of treatment has the potential to modulate its efficacy. For instance, studies comparing the clinical and fMRI effects of pharmacological treatments directly targeting synapses versus other types of therapies (e.g., inhibitors of cholinesterase enzymes) are lacking. In the case of nonpharmacological interventions, the long-term potential of the combination of cognitive and motor rehabilitation has been amply postulated in neurodegenerative disorders [73]; however, only two reviewed studies [68,70] adopted this combined approach, demonstrating its effect on cognitive and behavioral improvement even after 22 weeks [68]. The success of this last-mentioned study is also attributable to the nature of the proposed training, which involved both patients and caregivers, thus guaranteeing continuous care at home [68]. Furthermore, the different efficacy based on intensity of training has been poorly considered. This is important since in other conditions, such as Parkinson's disease, training on alternate days has been demonstrated to be more efficient than an intense (everyday) approach [74].
We have the following suggestions: 1) the selection of the type, intensity and duration of treatment is relevant and can modulate the long-term effect of intervention; 2) studies comparing the clinical and fMRI effects of pharmacological treatments directly targeting synapses versus other types of therapies are needed; and 3) in nonpharmacological interventions, studies aimed at assessing the efficacy of the cognitive and motor training combination as well as at establishing the optimal intensity of treatment are warranted.

The choice of outcome measures
The main difficulty for these studies is to transfer outcome measures from the laboratory to real life. fMRI can contribute to this effort by identifying, through the task or using a resting-state approach, the brain regions or brain networks that are sensitive to treatment and that can predict the everyday activities for which treatment is likely to be effective.
However, building the proper fMRI task is challenging. First, cognitive fMRI experiments used to test behavioral longitudinal changes can be biased by learning effects, especially when the interval between pre- and posttreatment evaluation is short. The use of parallel versions of the same task avoids the detection of an improvement due to learning. In the majority of pharmacological studies, mainly the observational ones, the selected task is disease-driven, i.e., it aims to test the drug efficacy on cognitive domains known to be affected in AD such as episodic or semantic memory (encoding, recall, recognition, pair-association), visuo-spatial abilities, and auditory working memory. In the same way, when a resting-state approach is preferred, functional connectivity within the DMN, as the most affected network in AD, is usually the primary MRI outcome. Although this approach is understandable and driven by what we know about the AD pathology, it runs the risk of losing some important information on the treatment efficacy. With such disease-driven methods, mechanisms of compensation and brain reorganization in unaffected brain areas may not be captured. For this purpose, Dhanjal and Wise investigated the effect of cholinesterase inhibitors on non-DMN networks, such as salience and executive-control networks, in a group of AD patients to determine whether improving memory function via modulation of frontoparietal connectivity was a possible compensatory mechanism [30]. The same strategy can be adopted by task-related fMRI designs, by observing whether the activity of nonmemory brain circuits, such as those subtending selective attention and/or distracter inhibition, could modulate the improvement of the encoding processing and the successful recall.
In nonpharmacological studies, the selected task is usually training-driven, i.e., it is built to verify improvement in activity in brain regions known to subtend the training-related functions. For instance, in the Explicit-Memory Training proposed by Hampstead and colleagues [61], patients acquired mnemonic strategies using face-name associations and the fMRI task used the same paradigm to test its efficacy. However, some studies used generic fMRI tasks (such as verbal fluency), as well as clinical outcome measures assessing global cognitive status, which are not specific and/or are unrelated to the performed training. The risk in these latter cases is to observe changes in fMRI activity unrelated to the training.
Finally, no study to date has directly compared clinical/cognitive versus fMRI outcome effect sizes (only the relationship between these variables has been assessed) in order to define which marker is the most powerful in reflecting treatment effects over time.
We have the following suggestions: 1) parallel versions of the same fMRI task are needed to avoid learning effects; 2) a whole brain fMRI investigation is necessary to gain a complete understanding of the effect of treatment across the brain; 3) training-driven tasks rather than global and unspecific tests are suggested as outcome measures in nonpharmacological studies; and 4) clinical/cognitive versus fMRI effect size comparisons should be provided.
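One possible way to make the comparison in suggestion 4 concrete is to express the pre-to-post change in each outcome on a common, unit-free scale. The minimal sketch below uses the standardized response mean (mean change divided by the standard deviation of change); this choice of metric, the function name, and the example values are illustrative assumptions and are not taken from any of the reviewed studies.

```python
from statistics import mean, stdev


def standardized_response_mean(pre, post):
    """Mean pre-to-post change divided by the SD of the change scores.

    `pre` and `post` are paired measurements from the same patients,
    given in the same order. This is one common longitudinal effect
    size; other definitions exist.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must contain the same patients")
    changes = [after - before for before, after in zip(pre, post)]
    if len(changes) < 2:
        raise ValueError("at least two patients are needed")
    spread = stdev(changes)
    if spread == 0:
        raise ValueError("change scores have zero variance")
    return mean(changes) / spread


# Hypothetical paired data for five patients (illustration only).
cognitive_pre = [20.0, 22.0, 19.0, 24.0, 21.0]
cognitive_post = [21.0, 22.0, 21.0, 24.0, 23.0]
fmri_pre = [0.10, 0.15, 0.12, 0.20, 0.11]
fmri_post = [0.18, 0.21, 0.20, 0.25, 0.19]

srm_cognitive = standardized_response_mean(cognitive_pre, cognitive_post)
srm_fmri = standardized_response_mean(fmri_pre, fmri_post)
print(f"cognitive SRM: {srm_cognitive:.2f}, fMRI SRM: {srm_fmri:.2f}")
```

Because both values are unit-free, they can be reported side by side for the same patients; a formal comparison would still need confidence intervals, for example from bootstrapping.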
Incomplete outcomes, drop-out cases, and sample size
Incomplete outcome measures are often an important problem in these studies. The reasons for incomplete data or drop-outs are often related to the treatment itself (side effects), but they could also be associated with the MRI environment (claustrophobia or difficulties lying down in the scanner during the entire duration of the protocol), technical MRI issues (motion artifacts or unrecorded behavioral performances during the task), patient difficulties in understanding and/or maintaining the task instructions, progression of the disease, changes in motivation, and lack of compliance. In aging and cognitively impaired populations, drop-outs are frequent and should be anticipated during the recruitment phase by enrolling larger initial samples. If they are not, the consequences for the research protocol can be severe, resulting in reduced study power. For instance, Bokde and colleagues [47] enrolled 12 MCI patients in their trial and randomly assigned them to treated and placebo groups with a 2:3 ratio, respectively. Due to several drop-out cases, the placebo group finally included only two subjects and the analysis within this group was not statistically feasible [47]. Furthermore, negative findings are questionable in cases of a small sample size; for example, McGeown and colleagues [36] reported no efficacy of 20 weeks of treatment with donepezil in a group of 12 AD patients on task-related fMRI activity and on behavioral performances.
By using a semi-cylindrical panel covering the patient's body from the head to the knees (simulating the limited space in the scanner) together with a loud white noise through headphones (mimicking the noise of the scanner), Lorenzi and colleagues [56] performed a 9-min fMRI scan simulation during patient screening. This simple system tested the patient's ability to rest, without moving, in an 'unusual' environment for the entire scan acquisition, thereby ensuring patient comfort and data quality. This simulation was useful for testing the patient's tolerance of the MRI noise and environment, and for detecting the presence of claustrophobia and other behavioral complaints, such as agitation and anxiety, not identified during the interview with the caregiver but triggered during this 'unusual' situation. After the MRI simulation, 12 out of 28 moderate-to-severe AD patients did not pass the screening, while all but one of the remaining patients were successfully scanned and completed the study [56]. In addition, for some patients, task instructions could be difficult to understand and/or maintain during the sequence. Cognitive difficulties are likely to affect patient behavioral performances during the acquisition, and the fMRI signal could reflect a pattern unrelated to the investigated domain. A bias mitigation action could be to train the patient for several sessions prior to the MRI scan in order to assess comprehension and maintenance of the task instructions.
Finally, patient and caregiver motivation is also crucial for the success of clinical trials. In the study of Baglio and colleagues [68], patients and caregivers underwent a multidimensional stimulation group therapy, which included 30 training sessions for the patient and an educational program for the caregiver to favor a long-term positive interaction with patients at home. The involvement of the caregivers was highly motivating, with more than 80% of the initially recruited population still being part of the study at the 32-week clinical follow-up. However, in the same study, the fMRI part was apparently less 'appealing' since only 55% of the initial sample concluded the follow-up at week 10.
We have the following suggestions: 1) the statistical power of the study must be estimated, and larger samples should be recruited to account for the attrition rate (multicenter collaborations could be an option to mitigate this issue); 2) results should be validated and tested using independent data; 3) simulations of MRI examination should be included in the patient screening phase for detecting cases of claustrophobia, behavioral complaints, or difficulties in lying down in the scanner; and 4) caregivers should be involved as much as possible in the study to increase patient compliance.
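As a rough illustration of suggestion 1, the sketch below estimates how many patients per group would need to be enrolled so that enough of them remain after drop-outs. It assumes a two-sided comparison of two group means with the usual normal approximation, and it assumes that drop-out is unrelated to the outcome; the function names and the example effect size and attrition rate are hypothetical and should be replaced by values justified for the specific trial.

```python
import math
from statistics import NormalDist


def completers_per_group(effect_size, alpha=0.05, power=0.80):
    """Patients per group who must complete the study.

    Normal-approximation formula for comparing two group means:
    n = 2 * ((z_{1 - alpha/2} + z_{power}) / d) ** 2,
    where d is the standardized difference between groups.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


def enrolled_per_group(effect_size, attrition, alpha=0.05, power=0.80):
    """Patients per group to enroll, inflated for the expected attrition."""
    if not 0 <= attrition < 1:
        raise ValueError("attrition must be in [0, 1)")
    needed = completers_per_group(effect_size, alpha, power)
    return math.ceil(needed / (1 - attrition))


# Hypothetical planning values: medium effect (d = 0.5), 20% drop-out.
print(completers_per_group(0.5))
print(enrolled_per_group(0.5, attrition=0.20))
```

The normal approximation slightly underestimates the sample size required by a t-test, so the result is best read as a lower bound for planning purposes.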

Some MRI technical issues
Longitudinal MRI studies require monitoring of MRI data stability over time. The same MRI scanner should be used for all subjects for the entire duration of the study. The reproducibility of fMRI signal changes in young and old healthy individuals and in cognitively impaired subjects during memory tasks and resting-state fMRI is only modest [75][76][77][78]. Thus, the MRI signal should be verified using pre- and post-reproducibility studies. In this review, we noticed that only a few studies proposed two pretraining MRI scan sessions [29,30,60]. This is a key method for distinguishing brain changes related to repetition (a mere test-retest effect) from those associated with treatment or training. Unfortunately, the same studies [29,30,60] did not include control conditions, thus the test-retest study did not help to understand whether brain changes were specific to the treatment or training. Although a direct comparison between task-based and resting-state fMRI reproducibility has not been tested in any of the reviewed studies, the literature suggests that resting-state fMRI is better suited to providing reproducible patterns of fMRI connectivity over time and across scanner platforms, since no special equipment is required and individuals do not have to be able to perform a cognitive task [79].
AD and MCI patients are known to have brain atrophy. However, only a few studies investigated cortical atrophy [27,33,45,52], and only one study accounted for gray matter volume in the second-level fMRI analysis [55]. Partial volume effects can lead to a wrong interpretation of greater fMRI intensity in voxels with smaller proportions of gray matter, with the risk of affecting group comparisons [80].
We have the following suggestions: 1) the same MRI scanner should be used for the entire duration of the study, and the stability of the MRI signal should be verified using pre/post reproducibility studies; and 2) second-level analyses should take into account gray matter density at the voxel level.

Conclusions
This critical review pointed out both strengths and caveats of the existing literature on the effects of pharmacological and nonpharmacological treatments on brain fMRI in AD and MCI. In general, although both task-based and resting-state fMRI have been valuable in detecting even subtle changes over a short period of treatment, current knowledge does not allow us to support fMRI as a suitable candidate outcome measure. Although a large amount of work has been done so far, there is an urgent need to increase the number of studies and improve their reliability by strengthening the soundness of the methods. We underline the importance of sample size and patient selection for increasing the statistical power, the need for validation and testing (using independent data), the appropriateness of the study design, the ecological value of the interventions to increase the likelihood of transferability into daily life, and whole brain investigation in order to capture both pathological and compensatory mechanisms. Finally, the existing literature suggests that attention should be paid to the motivation of patients and caregivers in order to avoid drop-outs during the follow-up. Future larger studies with improved design will allow us to perform a meta-analysis, which is the best approach for providing conclusive information on fMRI as a relevant outcome measure.