Alzheimer's disease therapeutic research: the path forward

The field of Alzheimer's disease therapeutic research seems poised to bring to clinic the next generation of treatments, moving beyond symptomatic benefits to modification of the underlying neurobiology of the disease. But a series of recent trials has had disappointingly negative results that raise questions about our drug development strategies. Consideration of ongoing programs reveals serious pitfalls. Nevertheless, a clear path forward is emerging. Successful strategies will utilize newly available tools to reconsider issues of diagnosis, assessment and analysis, facilitating the study of new treatments at early stages in the disease process at which they are most likely to yield major clinical benefits.

Alzheimer's disease (AD) was described just over 100 years ago as an uncommon devastating dementia affecting people in middle age. In the 1970s, Dr Robert Katzman demonstrated that AD is in fact an epidemic of enormous proportions, affecting a substantial segment of the aging population [1]. This spurred basic and clinical therapeutic research activity, leading to the development of modestly effective symptomatic treatments. While efforts to improve cognitive and behavioral symptoms continue, the major focus of AD therapeutic research is now disease modification, that is, slowing the progression of the underlying neurobiology of AD [2]. Alois Alzheimer described neuronal loss with formation of plaques and tangles. Today's leading programs target the biochemical pathways leading to amyloid accumulation and neurofibrillary tangle formation, and aim to protect neuronal cells and synapses against dysfunction and destruction.
Clear targets have been identified. Two enzymes, beta secretase and the gamma secretase complex, appear to be essential for cleavage of the amyloidogenic Aβ fragment from its transmembrane amyloid precursor protein (APP); inhibition of one or both is expected to reduce amyloid accumulation [3]. Genetic evidence provides strong support for these approaches: all known genetic causes of AD either increase the expression of APP or increase the generation of amyloidogenic fragments. There is also hope that inhibiting receptors that mediate Aβ trafficking [4,5] and toxicity [5,6] may modify AD neurodegeneration. Tangle-related approaches, including kinase inhibitors aiming to reduce the hyperphosphorylation that characterizes the abnormal tau protein in tangles [7], have seen more limited efforts. Neurotrophic programs include direct neurosurgical delivery of nerve growth factor to the nucleus basalis using a viral vector [8].
But despite the proliferation of clinical development programs, early results have been quite disappointing. The first two anti-amyloid drugs to reach the pivotal stage of development, tramiprosate and tarenflurbil, failed in phase III. What are the implications of these failures? Are the targets wrong? Can the field afford to invest the huge efforts and funds necessary to continue to test potential disease-modifying treatments? Is there any realistic likelihood of success?

Tramiprosate
Tramiprosate (also referred to as homotaurine and 3-amino-1-propanesulfonic acid, or 3APS) is an Aβ-binding compound that was developed using in vitro and in vivo model systems [9] that left some uncertainty regarding the brain concentration necessary for a pharmacodynamic effect in human AD. While a phase II study did suggest a reduction in cerebrospinal fluid Aβ in AD subjects treated with tramiprosate [10], it was unknown whether the degree of reduction would be sufficient to translate into clinical benefit. The small and brief phase II program was not designed to demonstrate a disease-slowing effect clinically; as expected, placebo-treated subjects in the 12 week trial showed no decline, and, therefore, there was no possibility of showing reduced decline with treatment. The development of tramiprosate as a pharmaceutical treatment for AD was halted when the first phase III trial failed to demonstrate significant beneficial effects on the primary analysis of cognitive and clinical outcomes.

Tarenflurbil
Similar problems were faced in the tarenflurbil development program. In vitro and in vivo, the drug clearly modulates gamma secretase activity, reducing generation of Aβ [11,12]. In early human studies, however, there was no strong biomarker evidence of amyloid reduction in cerebrospinal fluid (CSF), and concerns about inadequate brain concentrations in humans were voiced. The phase II trial, though relatively large, was underpowered to show a disease-modifying effect, and the primary analysis of the impact of treatment on cognitive and functional outcomes was negative [13]. However, post hoc analyses appeared to be consistent with a beneficial drug effect, leading to the launch of large phase III trials. The program was terminated when the first phase III trial showed no evidence of beneficial effect.

Bapineuzumab
A somewhat similar situation has arisen in the development program of bapineuzumab, a monoclonal amino terminus-specific anti-amyloid antibody [14]. Perhaps misled by encouraging cognitive data from a small phase I trial hinting at a symptomatic effect, the sponsors sought evidence of efficacy in the modestly sized phase II program. Though there was evidence of benefit in a number of secondary analyses, particularly in the apolipoprotein E ε4 negative subgroup, the primary cognitive efficacy analysis was negative. The sponsors, Elan and Wyeth, have nonetheless proceeded with a very large phase III program.
Why have so many programs yielded discouraging efficacy data in phase II and III clinical trials? In phase II, the problem may be primarily one of statistical power. Most programs seek to be able to demonstrate a 25% to 33% slowing of progression of mild or mild to moderate AD. But in view of substantial inter-subject and inter-site variance issues, a large and long trial is necessary. Estimates using preliminary data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) [15] suggest that to demonstrate a 33% reduction in progression rate (as measured by change in Alzheimer's Disease Assessment Scale-cognitive subscale (ADAS-cog) or Clinical Dementia Rating 'sum of boxes' (CDR-SB)) in an 18 month trial in mild AD, approximately 300 subjects per group are required (M Donohue et al., unpublished). No phase II program has approached this size. It should be noted that the ADNI experience probably underestimates the sample size required, in that it may be expected that variance will be greater in commercial trials, particularly those that are international, than among the academic North American sites participating in ADNI.
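The arithmetic behind estimates of this kind can be sketched with the standard two-sample normal approximation. This is a minimal illustration and not the ADNI analysis itself: the function name, the placebo decline of 4.0 points and the standard deviation of 6.0 points are hypothetical values chosen only to show the calculation, and dropout, repeated measures and site effects are ignored.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(placebo_decline, sd_change, slowing, alpha=0.05, power=0.80):
    """Per-group sample size for a two-arm comparison of mean change scores.

    Normal approximation:
        n = 2 * (z_{1-alpha/2} + z_{power})^2 * sd^2 / delta^2
    where delta is the absolute difference in mean decline between arms.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_beta = z.inv_cdf(power)            # quantile for the desired power
    delta = slowing * placebo_decline    # e.g. 33% of the placebo decline
    return ceil(2 * (z_alpha + z_beta) ** 2 * sd_change ** 2 / delta ** 2)


# Hypothetical inputs (NOT ADNI estimates): 4.0-point mean placebo decline, SD 6.0
n_33 = subjects_per_group(placebo_decline=4.0, sd_change=6.0, slowing=0.33)
n_25 = subjects_per_group(placebo_decline=4.0, sd_change=6.0, slowing=0.25)
print(n_33, n_25)
```

Because the required number scales with the inverse square of the targeted difference, moving the target from 33% to 25% slowing under the same assumptions raises the per-group requirement considerably, and any increase in variance raises it further.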

Other recent trials: Rember and dimebon
As with the anti-amyloid agents tramiprosate, tarenflurbil and bapineuzumab, the phase II trial of the anti-tangle compound Rember (methylene blue) did not meet its primary efficacy objectives [16]. In view of the modest group sizes and short duration, this too is not surprising, regardless of whether the drug ultimately proves effective in pivotal trials. But as with the anti-amyloid programs, caution must be exercised in the interpretation of post hoc analyses of the Rember trial data.
The development of dimebon represents the one recent AD program with strikingly positive results. At the primary 6 month analysis, strongly significant beneficial effects were seen on all outcome measures [17]. With continuation of the blind through 12 months of treatment, the effects on outcome measures increased, consistent with (though not definitive evidence of) a disease-modifying effect [17]. The success of this modestly sized trial is indicative of the immediate symptomatic benefit associated with the treatment. If a putative disease-modifying drug yields short-term benefits, short (6 month) trials may be sufficient for regulatory approval. Symptomatic effects are plausible with neuroprotective and anti-amyloid drugs. But in the absence of such effects, a modest, phase II-type trial will be insufficient; little or no cognitive and clinical decline can be observed in 6 months, so no slowing of progression can be demonstrated. A consensus has arisen that 18 months or longer is an appropriate duration of treatment for studies aiming to show slowing of decline in AD.
But what about the two negative phase III trials of plausible anti-amyloid agents? There was certainly a 'phase II problem'; that is, phase III proceeded without evidence of efficacy in phase II. As expected, the small phase II tramiprosate study did not show any efficacy signal, but the modest reduction in CSF Aβ42 was considered encouraging. But it is unknown what the size of this biomarker signal must be to predict clinical efficacy with prolonged treatment. Further, there were questions about the magnitude and consistency of central nervous system drug penetration and concentration. But in addition to these uncertainties, the power of the phase III North American tramiprosate trial was lower than expected. The placebo group decline was smaller, and the standard deviation of the change score was higher, than expected; the power to demonstrate the target effect size of 25% slowing with the group sizes of 350 was limited. The tarenflurbil phase III program followed a phase II study that (not surprisingly) failed to achieve its primary efficacy objectives, so the risk of a negative phase III program had to be considered substantial; only the post hoc phase II analyses were encouraging. In addition, there was no convincing evidence of pharmacodynamic effect; specifically, no reduction in CSF Aβ in humans had been demonstrated. The negative trial results may reflect inadequate brain penetration in humans to yield a sufficient reduction in the generation of Aβ.
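The way a smaller placebo decline and a larger standard deviation erode power at a fixed group size can be illustrated with a short calculation. The sketch below uses the group size of 350 and the 25% target mentioned above, but the decline and standard deviation values are hypothetical (the trial's actual planning assumptions and observed values are not given here), and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(placebo_decline, sd_change, slowing, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test on mean change scores.

    The negligible chance of rejecting in the wrong direction is ignored.
    """
    z = NormalDist()
    delta = slowing * placebo_decline                 # absolute treatment difference
    standard_error = sd_change * sqrt(2 / n_per_group)
    return z.cdf(delta / standard_error - z.inv_cdf(1 - alpha / 2))


# Hypothetical planning assumptions: 4.0-point decline, SD 6.0
planned = approximate_power(4.0, 6.0, slowing=0.25, n_per_group=350)
# Hypothetical observed scenario: smaller decline (3.0) and larger SD (7.0)
observed = approximate_power(3.0, 7.0, slowing=0.25, n_per_group=350)
print(round(planned, 2), round(observed, 2))
```

Both changes act in the same direction: the absolute difference to be detected shrinks with the placebo decline, while the noise against which it must be detected grows with the standard deviation.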
On the basis of these plausible explanations, the negative results of the phase II and III anti-amyloid trials cannot be considered to be strong evidence against the amyloid cascade hypothesis. The scientific basis for the hypothesis remains quite compelling. Aβ42 is highly toxic to neuronal cells and synaptic function, particularly in its oligomeric states. Each of the known genetic causes of AD is closely linked to Aβ generation: Down syndrome to APP overexpression, and familial autosomal dominant AD to mutations of APP and presenilins 1 and 2 that increase amyloidogenic cleavage of APP. Occam's Razor points strongly to amyloid as the pivotal molecule. Recent reports of a small number of AD patients with progressive dementia despite apparent amyloid plaque clearance resulting from active vaccination [18] do not disprove the hypothesis; clinical data on these individuals are limited, plaques are probably not the most important form of Aβ, the course of disease had these patients not been treated is unknown, and perhaps earlier treatment is necessary for a profound effect on outcome.

The need for early intervention
This last point may be key. There is strong evidence that the pathobiology of AD precedes dementia by many years. In Down syndrome, amyloid deposition in brain precedes dementia by years or decades [19,20]. There is a high prevalence of brain amyloid in non-demented elderly individuals at autopsy [21], perhaps an indication of a long pre-symptomatic stage. Similarly, neuroimaging evidence of brain amyloid deposition is common [22]. Subtle memory symptoms and cognitive decline have been documented more than a decade before dementia onset [23]. If, as suggested by the recent active vaccine study autopsy report, dementia can progress despite elimination of amyloid plaques in AD patients [18], perhaps it is necessary to intervene earlier in the disease process.
At present, the diagnosis of AD requires the presence of dementia. What is the relationship of pre-dementia cognitive dysfunction to AD? What is the significance of amyloid brain deposition in the absence of cognitive impairment? With two plausible (if not yet proven) methods for identifying brain amyloid deposition, positron emission tomography (PET) scanning and CSF measurement of Aβ42, identification of such individuals is quite feasible. If either subtle cognitive impairment or amyloid deposition in brain consistently predicts AD dementia, it should be considered an early stage of AD. That is, we should revise the standard NINCDS-ADRDA criteria for AD [24].
Dubois and colleagues [25] have proposed one possible revision. They suggest that a 'research diagnosis' of AD be based on the presence of gradually progressive episodic memory impairment with evidence of AD neurobiology documented by the presence of one or more among several characteristic biomarker signals. The biomarker signals include medial temporal lobe atrophy by volumetric magnetic resonance imaging (MRI), temporal parietal hypometabolism by [18F]fluorodeoxyglucose (FDG)-PET, amyloid deposition by PET, or CSF findings (elevated tau or phospho-tau, and/or low Aβ42) characteristic of AD. The proposed criteria can be applied in the pre-dementia or dementia stages.
It is plausible that effective disease-modifying interventions might be only minimally effective or even futile at the dementia stage; neuroprotection or favorable effects on the inciting amyloid dysregulation might be overwhelmed by extensive neuronal/synaptic degeneration and plaque pathology. Extending the diagnosis of AD to include individuals with mild cognitive impairment and even normal cognition when there is biomarker evidence of AD-type pathophysiology might facilitate the development of disease-modifying drugs for the treatment of individuals most likely to respond. The earlier the disease-modifying intervention, the greater the expected impact on the disease course. This idea is supported by a number of therapeutic studies in transgenic animal models [26,27].

The design of early intervention trials
Altering the definition of AD may not be necessary for the development of early interventions. A possible development strategy for early intervention using the current diagnostic criteria would be to conduct studies aiming to demonstrate that an intervention increases the time to dementia diagnosis. Several completed studies have enrolled subjects with amnestic mild cognitive impairment (MCI), with the primary analysis of survival to consensus diagnosis of AD [28]. This design has the advantage of clear clinical validity, a desirable feature in consideration of the uncertain regulatory status of the MCI designation. At least one set of MCI criteria seems to predict a high likelihood of AD diagnosis (approximately 15% per year), so that such a trial can have a reasonable size with adequate power to demonstrate a treatment effect [29]. But progression from MCI to AD is not a discrete event; the loss of function necessary to meet criteria for dementia occurs gradually, and it is challenging to assign a specific date to dementia onset. This subjectivity may be aggravated in large international trials. The progression of cognitive and functional impairment caused by AD pathobiology is insidious; defining a discrete disease onset seems arbitrary.
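The sizing of a time-to-dementia trial of this kind can be sketched as follows. This is a rough illustration only: it uses the Schoenfeld approximation for the number of events in a two-arm survival comparison, treats the approximately 15% annual conversion rate as a constant hazard, and assumes equal follow-up with no dropout. The hazard ratio of 0.75, the 3 year duration and the function names are hypothetical choices, not values from the studies cited above.

```python
from math import ceil, exp, log
from statistics import NormalDist


def events_required(hazard_ratio, alpha=0.05, power=0.80):
    """Total conversions needed (Schoenfeld approximation, 1:1 allocation)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(4 * z_sum ** 2 / log(hazard_ratio) ** 2)


def total_subjects(hazard_ratio, annual_conversion, years, alpha=0.05, power=0.80):
    """Total enrollment so that the expected number of conversions meets the target.

    Assumes a constant hazard, identical follow-up for every subject and no dropout.
    """
    placebo_hazard = -log(1 - annual_conversion)          # 15%/year -> yearly hazard
    p_placebo = 1 - exp(-placebo_hazard * years)          # converts on placebo
    p_treated = 1 - exp(-placebo_hazard * hazard_ratio * years)
    mean_event_probability = (p_placebo + p_treated) / 2  # equal-sized arms
    return ceil(events_required(hazard_ratio, alpha, power) / mean_event_probability)


# Hypothetical design: hazard ratio 0.75, 15% annual conversion, 3 years of follow-up
events = events_required(0.75)
enrollment = total_subjects(0.75, annual_conversion=0.15, years=3)
print(events, enrollment)
```

The calculation shows why the conversion rate matters so much for this design: the number of events is fixed by the targeted hazard ratio, so a lower conversion rate or a shorter trial translates directly into a larger enrollment.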
If the diagnostic criteria for AD are modified to encompass individuals prior to the onset of dementia, it would be straightforward to design trials with standard AD co-primary outcome measures. The ADNI longitudinal data demonstrate acceptable decline rate and variance for the ADAS-cog and CDR-SB in amnestic MCI; the size of trials adequately powered to demonstrate slowing of cognitive/clinical progression would be large but perhaps manageable. Adding selection criteria, and perhaps covariates to adjust for disease state, will reduce sample sizes substantially. In particular, for the development of anti-amyloid programs such as secretase inhibitors, anti-aggregation agents and anti-amyloid immunotherapy, trials can select MCI patients with biomarker evidence of brain amyloid deposition. Two options are feasible, though each presents challenges. The advent of F18 amyloid-binding radiotracers has established the feasibility of amyloid brain imaging at most sites with PET scanners, but this is an expensive undertaking that has not yet been fully validated. CSF Aβ42 is strongly associated with neuroimaging evidence of amyloid deposition [30] and is essentially universally available, though requiring lumbar puncture during study screening may not be welcomed by investigators and especially subjects. The addition of covariates such as MRI volumetric measures to analysis plans will reduce unexplained variance and further increase statistical power to demonstrate slowing of progression. If the community of AD investigators, clinicians and regulators were to adopt early AD diagnostic criteria, feasible early AD trials could be launched immediately (Table 1).
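The effect of a baseline covariate on sample size can be made concrete with a standard result from analysis of covariance: adjusting for a covariate that correlates with the outcome at r reduces the residual variance, and hence the required sample size, by a factor of approximately (1 - r^2). The sketch below applies that factor; the starting size of 300 per group and the correlation of 0.5 are illustrative assumptions, not estimates for any particular MRI measure.

```python
from math import ceil


def covariate_adjusted_n(unadjusted_n, correlation):
    """Per-group sample size after adjusting for one baseline covariate.

    Uses the large-sample ANCOVA approximation: residual variance, and so the
    required n, is multiplied by (1 - r^2), where r is the correlation between
    the covariate and the outcome.
    """
    if not -1 < correlation < 1:
        raise ValueError("correlation must lie strictly between -1 and 1")
    return ceil(unadjusted_n * (1 - correlation ** 2))


# Hypothetical example: 300 per group unadjusted, covariate-outcome correlation 0.5
print(covariate_adjusted_n(300, 0.5))
```

A covariate that is only weakly related to the outcome yields little saving, which is why the choice of covariate should rest on longitudinal data rather than on availability alone.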
While disease-modifying treatment of MCI is expected to yield more dramatic benefits than treatment of mild AD, perhaps the most appropriate population for intervention is the earlier, pre-symptomatic (or very mildly symptomatic) subjects with biomarker evidence suggestive of AD. The ultimate goal of disease-modification programs, prevention of AD dementia, is conceivable if treatment is started before appreciable neuronal damage and synaptic dysfunction have occurred. Preliminary evidence from some studies suggests that markers of amyloid accumulation predict dementia even in asymptomatic individuals [31].

Validated surrogate markers in AD
But in the absence of symptoms, it will not be possible to fulfill the conventional US Food and Drug Administration requirement for AD drug development: demonstration of efficacy on co-primary measures, specifically a cognitive performance test and a functional/global measure. It would require huge and lengthy studies to show slowing of cognitive and clinical progression or delay to diagnosis of dementia in subjects not yet showing any symptoms. To study interventions in this population, we will require validated surrogate markers.
A biomarker is any objectively measured characteristic that reflects normal or pathological processes, or responses to therapeutic intervention. As discussed above, biomarkers can be valuable in selecting subjects for clinical trials and for therapeutic interventions, for reducing unexplained variance and thus improving statistical power, and for establishing proof of concept in early phase drug development.
In rare cases, a biomarker can take the place of a clinical endpoint for establishing efficacy in a phase III clinical trial; that is, a biomarker can be validated as a surrogate endpoint. Examples of such surrogate markers include blood glucose and HbA1c in diabetes, blood pressure and cholesterol in cardiovascular disease, intraocular pressure in glaucoma, and lymphocyte subset ratios and viral load in HIV disease. To validate a biomarker as a surrogate endpoint, several issues must be addressed. There must be a well-accepted scientific framework connecting the biomarker to disease mechanisms and the prediction of clinical outcomes. Further, drug effects on the biomarker must be related to drug effects on clinical outcome; ideally, the biomarker should fully capture treatment effects, as confirmed by clinical trials of multiple interventions.

Table 1 (partial; the caption, column headings and some cell text are missing from the source). Entries are given for the three trial designs in the original column order.
Analysis covariates: Baseline cognition and regional brain volume | Baseline cognition and regional brain volume | Regional brain volume
Biomarker outcome: Regional brain atrophy | Regional brain atrophy | Regional brain atrophy and/or amyloid measure (as surrogate endpoint)
Duration of treatment: 18 months | 24 months | 24 to 36 months
Primary analysis: Change score or slope of co-primaries: ADAS-cog11, [...] | Change score or slope of co-primaries: [...] | Regional brain atrophy rate and [...]

It is unlikely that an ideal surrogate for disease-modifying intervention in AD will become available in the foreseeable future. However, in consideration of the enormous clinical need, and the likelihood that the development of highly effective disease-modifying treatments will require the use of surrogate endpoints, it is reasonable to assume that regulatory agencies will consider acceptance of surrogates that are less than ideal.
A validated surrogate marker is essential for the study of AD interventions in asymptomatic or very mildly symptomatic individuals. It may be feasible to gain acceptance of a surrogate AD biomarker with a small number of trials demonstrating concordant treatment effects on the biomarkers and clinical symptoms. Even if the benefits of disease-modifying treatments in mild AD dementia are limited, they may well be sufficient to establish this concordance. Indeed, ongoing anti-amyloid trials that have incorporated biomarkers could provide this evidence. Consensus among clinical experts, based on robust data, that candidate biomarkers track disease progression at various stages of disease will strengthen the case for validation. A leading candidate surrogate marker is brain atrophy rate as measured by volumetric MRI; a huge body of evidence supports a link between regional brain atrophy and progression of AD pathobiology [32-34].

Paving the path forward
Building the consensus necessary to shift regulatory guidelines, clinical trial design and clinical practice will require large-scale cooperation among pharmaceutical and biotech companies, academic leaders, advocacy groups, funders and regulators. It is fortunate that such cooperative efforts have been steadily gaining traction in recent years. Regular meetings involving all of the stakeholder groups have been productive; the semiannual Alzheimer's Association Research Roundtable, the annual Leon Thal Symposium sponsored by the Lou Ruvo Brain Institute, and the meetings of the Task Force on Use of Biomarkers in Alzheimer's Trials are leading examples that have advanced the field. They demonstrate the eagerness of many companies to share experience and ideas in pursuit of solutions to problems in AD therapeutic research.
Perhaps the best example of a cooperative effort among many to overcome the hurdles in drug development is ADNI. Led by Michael Weiner at the University of California San Francisco, ADNI is jointly funded by the National Institute on Aging, the Alzheimer's Association and other foundations, and contributions from pharmaceutical companies. It is a long-term effort to collect longitudinal cognitive, clinical, CSF and neuroimaging data on cohorts of individuals with mild AD, mild cognitive impairment and normal cognitive aging to allow optimal use of biomarkers in trial design. ADNI brings together leaders from academia, industry, government agencies and advocacy groups on at least a biweekly basis to jointly assess the study progress, and to discuss roadblocks and paths forward. To maximize scientific advance, all ADNI data are publicly posted in real time; the large number of presentations and publications from ADNI as well as outside investigators attests to its success. ADNI has also spawned or supported similar collaborative efforts in Europe, Japan, Australia and China.
ADNI and the various collaborative meetings have provided the data and ideas behind the discussion in this article. With continuation of these efforts, the common goal of optimal trial design is readily achievable. The challenges of determining populations for study, cognitive and clinical outcome measures, validation of biomarkers and analytic plans can be met within a few years. Consensus will lead to practical regulatory pathways, and to the successful introduction of disease-modifying interventions that will blunt the AD epidemic that is growing as the world's population ages.