The major objective of this multi-center, phase 2, randomized, double-blind, placebo-controlled 24-week trial was to determine whether p38α inhibition could reverse synaptic dysfunction and, with it, at least partially restore function, as assessed by episodic memory performance. For the primary endpoint, we found no difference between neflamapimod treatment and placebo in the change in episodic memory performance, as measured with the HVLT-R. However, the CSF biomarker results provide evidence suggesting that neflamapimod treatment had its intended biological effects. In particular, significantly decreased levels of the disease biomarkers CSF T-tau and CSF p-tau181 were observed with neflamapimod, compared with placebo, over the 24-week treatment period in the full efficacy population. For both T-tau and p-tau181, there was a mean ~3% increase in CSF levels in the placebo group, as expected, while there was a decrease of similar magnitude in the neflamapimod group. Tau phosphorylation and tau pathology have been identified in preclinical studies as downstream consequences of abnormally high p38α kinase activity [10, 16, 20]. Further, neflamapimod treatment in the Ts2 transgenic mouse model decreased, relative to vehicle, levels of p-tau (pS202) in the brain (Nixon RA, personal communication; presentation at Alzheimer’s Disease Genetics Global Symposium, 22 September 2020). Thus, as the reduction in tau phosphorylation is a consequence of p38α inhibition, the decreases in p-tau and tau in CSF in the current study demonstrate, at a minimum, target engagement for neflamapimod with respect to p38α inhibition. In addition, p-tau, tau, and neurogranin are the protein markers generally considered to be most closely associated with synaptic dysfunction in AD [34,35,36,37,38,39,40], so the decreases in CSF p-tau and tau also suggest that neflamapimod has a beneficial effect on synaptic dysfunction.
Results of the pre-specified PK–PD analyses suggest that a major factor in not meeting the primary clinical objective, despite significant effects of neflamapimod treatment on CSF biomarker levels, is that the dose of 40 mg twice daily was too low. Specifically, for the primary endpoint, HVLT-R, there were trends towards improvement, relative to placebo, from baseline to week 24 in subjects with Ctrough ≥4 ng/mL and in subjects with Ctrough at or above the pre-specified analysis cut-off, the 75th percentile for Ctrough within the study (5.4 ng/mL). This observation was supported by results in the secondary measure of episodic memory, the WMS Immediate and Delayed Recall composites, in subjects on background AD therapy. Among these, neflamapimod-treated subjects with Ctrough levels above the pre-specified 75th percentile demonstrated significantly better outcomes on WMS Immediate and Delayed Recall at weeks 12 and 24 than did those on placebo. The 4 to 5 ng/mL threshold is consistent with current understanding of the mechanism of action and potency of neflamapimod. When the trial was designed, there were no potency data available for neflamapimod in AD-relevant pharmacology; instead, the dose level of 40 mg was chosen based on effective doses of neflamapimod in aged rats [18]. Recent mechanistic studies indicate that a major pharmacological target of neflamapimod is endolysosomal dysfunction associated with the protein Rab5 [41]. In addition, neflamapimod was shown to block Aβ oligomer-induced, as well as prion-induced, dendritic spine loss in hippocampal neurons [42]. The in vitro potency, EC50, of neflamapimod for reversing Rab5+ endolysosomal dysfunction and for blocking Aβ oligomer-induced or prion-induced dendritic spine loss is 5–12.5 ng/mL. As brain concentrations of neflamapimod in preclinical studies are approximately two-fold higher than in the plasma, those potency concentrations overlap with the brain concentrations predicted from a plasma drug concentration of 4 to 5 ng/mL.
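The overlap described at the end of the preceding paragraph can be checked with simple arithmetic. The sketch below is illustrative only: the function and constant names are hypothetical, the two-fold brain-to-plasma ratio is the approximate preclinical figure quoted in the text, and applying it directly to human plasma levels is an assumption.

```python
# Values quoted in the text, all in ng/mL.
PLASMA_THRESHOLD = (4.0, 5.0)   # plasma Ctrough range associated with trends toward improvement
EC50_RANGE = (5.0, 12.5)        # in vitro potency range reported for neflamapimod
BRAIN_TO_PLASMA_RATIO = 2.0     # approximate preclinical ratio; assumed to carry over to humans


def predicted_brain_range(plasma_range, ratio=BRAIN_TO_PLASMA_RATIO):
    """Scale a (low, high) plasma concentration range by the brain-to-plasma ratio."""
    low, high = plasma_range
    return (low * ratio, high * ratio)


def ranges_overlap(a, b):
    """Return True if two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


brain_range = predicted_brain_range(PLASMA_THRESHOLD)
print("Predicted brain range (ng/mL):", brain_range)
print("Overlaps in vitro EC50 range:", ranges_overlap(brain_range, EC50_RANGE))
```

Under these assumptions, a plasma Ctrough of 4 to 5 ng/mL corresponds to a predicted brain concentration of roughly 8 to 10 ng/mL, which falls inside the 5–12.5 ng/mL potency range.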
The trial was designed and powered with a hypothesis, based on results in the preclinical models and in the two phase 2a clinical trials, that there would be substantial improvement from baseline in episodic memory function within the neflamapimod treatment group. The sample size in the current trial was based on the results of the phase 2a trials where, in both the 16-patient, 12-week trial (Study 302) that assessed episodic memory with the WMS and the 9-patient, 6-week trial (Study 303) that assessed episodic memory with the HVLT-R, the effect size for improvement from baseline exceeded 0.6. Based on those results, the sample size for the current study provided >80% statistical power for an effect size of 0.45 for the difference between neflamapimod treatment and placebo. Thus, with the decline of 0.15 from baseline to week 24 in the HVLT-R Z-score in the placebo group in the current trial, an improvement from baseline to week 24 of at least 0.3 in the HVLT-R Z-score within the neflamapimod treatment group was required to demonstrate a statistically significant effect on the primary endpoint, an effect that was not seen. This difference in outcome between the current trial and the two earlier phase 2a studies may be related to the two major differences between the current trial and phase 2a: (1) the patients in the earlier studies were less advanced in the disease and not receiving background AD therapy, and (2) the phase 2a studies included a higher dose of neflamapimod and achieved higher blood drug concentration levels. An impact of background therapy on the outcome is suggested by the neflamapimod participants not on background therapy with Ctrough ≥ 75th percentile having an improvement from baseline of 0.45 Z-score on the primary endpoint (see Supplemental Figure 1). However, there were only five participants included in this analysis, so no conclusions can be drawn from it.
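The threshold arithmetic in the preceding paragraph can be written out explicitly. The following is a minimal sketch, not the trial's actual statistical method: the function names are hypothetical, the between-group effect size of 0.45 is treated as a difference in Z-score units (as the text does implicitly), and the sample-size function assumes a two-sided alpha of 0.05 with a two-sample normal approximation, neither of which is stated in the text.

```python
from math import ceil
from statistics import NormalDist


def required_improvement(effect_size, placebo_change):
    """Smallest treated-group change such that
    (treated change - placebo change) >= effect_size."""
    return effect_size + placebo_change


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate per-group sample size for comparing two means
    (normal approximation, equal group sizes, two-sided test).
    The alpha and power defaults are illustrative assumptions."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_beta = z.inv_cdf(power)           # quantile corresponding to the target power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Placebo declined by 0.15 Z-score units; detectable difference is 0.45.
print(round(required_improvement(0.45, -0.15), 2))
print(n_per_group(0.45))
```

With a placebo change of -0.15 and a detectable difference of 0.45, the required within-group improvement is 0.45 - 0.15 = 0.30, matching the figure in the text. The sample-size function is included only to illustrate how an effect size of 0.45 relates to group size under the stated assumptions.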
With regard to the doses utilized in phase 2a, in Study 302, which utilized the WMS, nine participants received the same 40-mg BID dose as in the current study and seven received 125-mg BID (note: due to differences in excipient ratios utilized in the drug capsules, the 125-mg dose resulted in plasma drug levels, on average, only 50% higher than with 40-mg BID). Within Study 302, a statistically significant PK–PD relationship was established [27], a relationship indicating that plasma drug levels resulting from 125-mg BID were associated with greater improvement in WMS scores. In retrospect, and after obtaining the WMS results in the current study, the “improvement” seen in the 40-mg BID group likely resulted primarily from practice/learning effects, though the additional improvement at 50% higher plasma drug levels may still have been related to neflamapimod treatment. In the other phase 2a study (Study 303), in which positive effects on HVLT-R were demonstrated, the average plasma drug exposure with 40-mg BID was approximately 50% higher than in the current study because extremely low-weight subjects were included. We believe that the results in Study 303 were not due to practice effects, as there were no practice/learning effects evident on the HVLT-R in the current study, including in the participants not receiving background therapy. Overall, though limited by the small number of participants in the phase 2a studies, the plasma drug concentration–effect relationships in phase 2a are generally consistent with the PK–PD analyses in the current study and suggest that the therapeutically active dose range is at least 50% higher than 40-mg BID (e.g., ≥ 40-mg TID).
With a specific kinase inhibitor, where both biomarker and clinical effects would depend on inhibition of that target kinase, the expectation would be that the dose–response would be similar for the biomarker and clinical effects. In this case, the differences between neflamapimod and placebo effects on the biomarkers were modest (~5%), which, in the absence of a clinical effect, suggests that the 40-mg BID dose level is simply at the lower end of the pharmacologically active dose range. Indeed, biomarkers are, by design, intended to be more sensitive than clinical effects, and it is not unusual to see biomarker effects preceding clinical effects with respect to dose (i.e., at a lower dose) and/or time. For example, with aducanumab, significant effects on brain amyloid plaque by PET scan were seen at 3 mg/kg, though the clinical effects are limited to a dose of 10 mg/kg, where a moderately greater effect on amyloid plaque reduction is also seen [43]. Furthermore, at the clinically efficacious 10 mg/kg dose level, the majority of the effect on amyloid plaque load is seen by 26 weeks, while the clinical effect is not evident until week 52. Determining whether such relationships between biomarker and clinical effects exist for neflamapimod in AD will require further upward dose-ranging to first establish a clinical effect. In addition, a longer-duration clinical trial may show a more distinct plasma drug concentration–effect relationship (i.e., dose–response), in terms of CSF biomarker effects, than we were able to demonstrate in the current study.
Non-CNS AEs, particularly aminotransferase elevations, have limited development of p38 MAPK inhibitors for peripheral inflammatory disorders [44]. Neflamapimod has the potential to minimize such toxicities while maintaining robust pharmacological effects in the brain. This is because plasma drug concentrations are half those in the brain and because the drug is 95% protein-bound in whole blood, which further reduces peripheral effects, as protein binding decreases its potency three-fold. Our results support this concept, as only one of 78 neflamapimod recipients developed aminotransferase elevation to 3 times ULN, while pharmacological activity was demonstrated by the CSF biomarker results. Further, as the incidence of aminotransferase levels ≥ 3 times ULN was approximately 15% in a prior study of neflamapimod in rheumatoid arthritis patients at a dose of 250 mg twice daily (Ctrough approximately 30 ng/mL) [44], a low incidence of liver enzyme elevation is expected with dosing regimens that would consistently achieve Ctrough ≥4 ng/mL.
Limitations
This trial has limitations. First, the 24-week duration of the trial was not designed to ascertain effects on clinical disease progression. Second, the sample size was effectively further attenuated because only a minority of subjects achieved plasma drug levels in the identified potentially therapeutically active range. Third, in the PK–PD analysis, the sparse sampling approach utilized did not provide sufficient information to develop a robust population PK model, which would have allowed for a more thorough evaluation of the relationship between outcomes and PK parameters other than Ctrough. Finally, our two measures of episodic memory each have their respective strengths and limitations. The WMS, as a composite of three different cognitive tests, provides the more comprehensive assessment of memory function. In addition, having three modestly correlated cognitive tests inherently decreases variability for the composite assessment. However, with the repeated application of the WMS, which has no alternative versions (that is, the same version is applied at each visit), over the relatively short time period of the study, we saw substantial practice effects in treatment-naïve patients that precluded any ability to discern neflamapimod effects. With the HVLT-R, which has alternative versions, there was very little practice effect at the group level, as there was very little change in mean HVLT-R Z-scores over the 24 weeks of the study. However, with a single test, there was substantial within-subject variability from visit to visit. For example, within the placebo group, from baseline to week 6, nearly all scores for subjects above the median at baseline decreased, while those for all the subjects below the median increased; that is, there was substantial regression to the mean. One approach to handling such variability would be to have more than one assessment at baseline, for example, one during the first screening visit and one on day 1 before starting treatment. The optimal approach would be to have a cognitive testing battery composed of three or more distinct episodic memory tests, each with alternate versions.